NON-ENDOGENOUS, CONSTITUTIVELY ACTIVATED
HUMAN G PROTEIN-COUPLED RECEPTORS

This patent application is a continuation-in-part of, and claims priority from, U.S.
Serial Number 09/170,496, filed with the United States Patent and Trademark Office on
October 13, 1998. This application also claims the benefit of priority from the following
provisional applications, all filed via U.S. Express Mail with the United States Patent and
Trademark Office on the indicated dates: U.S. Provisional Number 60/110,060, filed
November 27, 1998; U.S. Provisional Number 60/120,416, filed February 16, 1999; U.S.
Provisional Number 60/121,852, filed February 26, 1999, claiming benefit of U.S.
Provisional Number 60/109,213, filed November 20, 1998; U.S. Provisional Number
60/123,944, filed March 12, 1999; U.S. Provisional Number 60/123,945, filed March 12,
1999; U.S. Provisional Number 60/123,948, filed March 12, 1999; U.S. Provisional
Number 60/123,951, filed March 12, 1999; U.S. Provisional Number 60/123,946, filed
March 12, 1999; U.S. Provisional Number 60/123,949, filed March 12, 1999; U.S.
Provisional Number 60/152,524, filed September 3, 1999, claiming benefit of U.S.
Provisional Number 60/151,114, filed August 27, 1999 and U.S. Provisional Number
60/108,029, filed November 12, 1998; U.S. Provisional Number 60/136,436, filed May 28,
1999; U.S. Provisional Number 60/136,439, filed May 28, 1999; U.S. Provisional Number
60/136,567, filed May 28, 1999; U.S. Provisional Number 60/137,127, filed May 28,
1999; U.S. Provisional Number 60/137,131, filed May 28, 1999; U.S. Provisional Number
60/141,448, filed June 29, 1999, claiming benefit of U.S. Provisional Number 60/136,437,
filed May 28, 1999; U.S. Provisional Number 60/156,633, filed September 29, 1999; U.S.
Provisional Number 60/156,555, filed September 29, 1999; U.S. Provisional Number
60/156,634, filed September 29, 1999; U.S. Provisional Number ______ (Arena
Pharmaceuticals, Inc. docket number: CH[illegible]), filed [illegible]; U.S.
Provisional Number ______ (Arena Pharmaceuticals, Inc. docket number: RUP6-1), filed
October 1, 1999; U.S. Provisional Number ______ (Arena Pharmaceuticals, Inc. docket
number: RUP7-1), filed October 1, 1999; U.S. Provisional Number ______ (Arena
Pharmaceuticals, Inc. docket number: CHN6-1), filed October 1, 1999; U.S. Provisional
Number ______ (Arena Pharmaceuticals, Inc. docket number: RUP5-1), filed October 1, 1999;
and U.S. Provisional Number ______ (Arena Pharmaceuticals, Inc. docket number: CHN9-1),
filed October 1, 1999. This application is also related to co-pending U.S. Serial Number
______ (Woodcock, Washburn, Kurtz, Makiewicz & Norris, LLP docket number AREN-
0050), filed on October 12, 1999 (via U.S. Express Mail) and U.S. Serial Number
09/364,425, filed on July 30, 1999, both incorporated herein by reference. This
application also claims priority to U.S. Serial Number ______ (Woodcock, Washburn,
Kurtz, Makiewicz & Norris, LLP docket number AREN-0054), filed on October 12, 1999
(via U.S. Express Mail), incorporated by reference herein in its entirety. Each of the
foregoing applications is incorporated by reference herein in its entirety.

FIELD OF THE INVENTION 

The invention disclosed in this patent document relates to transmembrane
receptors, and more particularly, to human G protein-coupled receptors, and specifically to
GPCRs that have been altered to establish or enhance constitutive activity of the receptor.
Preferably, the altered GPCRs are used for the direct identification of candidate compounds
as receptor agonists, inverse agonists or partial agonists having potential applicability as
therapeutic agents.

BACKGROUND OF THE INVENTION

Although a number of receptor classes exist in humans, by far the most
abundant and therapeutically relevant is represented by the G protein-coupled receptor (GPCR
or GPCRs) class. It is estimated that there are some 100,000 genes within the human genome,
and of these, approximately 2%, or 2,000 genes, are estimated to code for GPCRs. Receptors,
including GPCRs, for which the endogenous ligand has been identified are referred to as
"known" receptors, while receptors for which the endogenous ligand has not been identified
are referred to as "orphan" receptors. GPCRs represent an important area for the development
of pharmaceutical products: from approximately 20 of the 100 known GPCRs, 60% of all
prescription pharmaceuticals have been developed.

GPCRs share a common structural motif. All these receptors have seven
sequences of between 22 and 24 hydrophobic amino acids that form seven alpha helices, each
of which spans the membrane (each span is identified by number, i.e., transmembrane-1 (TM-
1), transmembrane-2 (TM-2), etc.). The transmembrane helices are joined by strands of amino
acids between transmembrane-2 and transmembrane-3, transmembrane-4 and transmembrane-
5, and transmembrane-6 and transmembrane-7 on the exterior, or "extracellular" side, of the
cell membrane (these are referred to as "extracellular" regions 1, 2 and 3 (EC-1, EC-2 and EC-
3), respectively). The transmembrane helices are also joined by strands of amino acids
between transmembrane-1 and transmembrane-2, transmembrane-3 and transmembrane-4, and
transmembrane-5 and transmembrane-6 on the interior, or "intracellular" side, of the cell
membrane (these are referred to as "intracellular" regions 1, 2 and 3 (IC-1, IC-2 and IC-3),
respectively). The "carboxy" ("C") terminus of the receptor lies in the intracellular space
within the cell, and the "amino" ("N") terminus of the receptor lies in the extracellular space
outside of the cell.
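The loop numbering described above follows a regular pattern, which the following minimal Python sketch makes explicit. The function name `loop_between` is hypothetical and used only for illustration; the mapping itself is taken directly from the preceding paragraph.

```python
# Illustrative sketch: names are hypothetical, the mapping follows the text above.
TM_HELICES = [f"TM-{i}" for i in range(1, 8)]  # seven transmembrane helices


def loop_between(n: int) -> str:
    """Return the loop joining TM-n and TM-(n+1), for n from 1 to 6.

    Odd n gives an intracellular loop (IC), even n an extracellular loop (EC).
    """
    if not 1 <= n <= 6:
        raise ValueError("n must be between 1 and 6")
    side = "IC" if n % 2 == 1 else "EC"
    index = (n + 1) // 2
    return f"{side}-{index}"


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"TM-{n} / TM-{n + 1}: {loop_between(n)}")
```

With this numbering, the IC-3 region discussed later in the document is the loop between TM-5 and TM-6.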

Generally, when an endogenous ligand binds with the receptor (often referred
to as "activation" of the receptor), there is a change in the conformation of the intracellular
region that allows for coupling between the intracellular region and an intracellular "G-
protein." It has been reported that GPCRs are "promiscuous" with respect to G proteins, i.e.,
that a GPCR can interact with more than one G protein. See, Kenakin, T., 43 Life Sciences
1095 (1988). Although other G proteins exist, currently, Gq, Gs, Gi, Gz and Go are the G proteins
that have been identified. Endogenous ligand-activated GPCR coupling with the G-protein
begins a signaling cascade process (referred to as "signal transduction"). Under normal
conditions, signal transduction ultimately results in cellular activation or cellular inhibition.
It is thought that the IC-3 loop as well as the carboxy terminus of the receptor interact with
the G protein.

Under physiological conditions, GPCRs exist in the cell membrane in
equilibrium between two different conformations: an "inactive" state and an "active" state.
A receptor in an inactive state is unable to link to the intracellular signaling transduction
pathway to produce a biological response. Changing the receptor conformation to the active
state allows linkage to the transduction pathway (via the G-protein) and produces a biological
response.

A receptor may be stabilized in an active state by an endogenous ligand or a
compound such as a drug. Recent discoveries, including but not exclusively limited to
modifications to the amino acid sequence of the receptor, provide means other than
endogenous ligands or drugs to promote and stabilize the receptor in the active state
conformation. These means effectively stabilize the receptor in an active state by
simulating the effect of an endogenous ligand binding to the receptor. Stabilization by
such ligand-independent means is termed "constitutive receptor activation."

SUMMARY OF THE INVENTION 
Disclosed herein are non-endogenous versions of endogenous, human GPCRs and
uses thereof.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a representation of 8XCRE-Luc reporter plasmid (see, Example
4(c)3.)

Figures 2A and 2B are graphic representations of the results of ATP and ADP
binding to endogenous TDAG8 (2A) and comparisons in serum and serum free media (2B).

Figure 3 is a graphic representation of the comparative signaling results of
CMV versus the GPCR Fusion Protein H9(F236K):Gsα.

DETAILED DESCRIPTION

The scientific literature that has evolved around receptors has adopted a
number of terms to refer to ligands having various effects on receptors. For clarity and
consistency, the following definitions will be used throughout this patent document. To the
extent that these definitions conflict with other definitions for these terms, the following
definitions shall control:

AGONISTS shall mean materials (e.g., ligands, candidate compounds) that
activate the intracellular response when they bind to the receptor, or enhance GTP binding to
membranes.

AMINO ACID ABBREVIATIONS used herein are set out in Table A: 

PARTIAL AGONISTS shall mean materials (e.g., ligands, candidate compounds)
that activate the intracellular response when they bind to the receptor to a lesser degree/extent
than do agonists, or enhance GTP binding to membranes to a lesser degree/extent than do
agonists.

ANTAGONIST shall mean materials (e.g., ligands, candidate compounds) that
competitively bind to the receptor at the same site as the agonists but which do not activate
the intracellular response initiated by the active form of the receptor, and can thereby inhibit
the intracellular responses by agonists or partial agonists. ANTAGONISTS do not diminish
the baseline intracellular response in the absence of an agonist or partial agonist.

CANDIDATE COMPOUND shall mean a molecule (for example, and not limitation,
a chemical compound) that is amenable to a screening technique. Preferably, the phrase
"candidate compound" does not include compounds which were publicly known to be
compounds selected from the group consisting of inverse agonist, agonist or antagonist to a
receptor, as previously determined by an indirect identification process ("indirectly identified
compound"); more preferably, not including an indirectly identified compound which has
previously been determined to have therapeutic efficacy in at least one mammal; and, most
preferably, not including an indirectly identified compound which has previously been
determined to have therapeutic utility in humans.

COMPOSITION means a material comprising at least one component; a
"pharmaceutical composition" is an example of a composition.

COMPOUND EFFICACY shall mean a measurement of the ability of a compound
to inhibit or stimulate receptor functionality, as opposed to receptor binding affinity.
Exemplary means of detecting compound efficacy are disclosed in the Example section of this
patent document.

CODON shall mean a grouping of three nucleotides (or equivalents to nucleotides)
which generally comprise a nucleoside (adenosine (A), guanosine (G), cytidine (C), uridine
(U) and thymidine (T)) coupled to a phosphate group and which, when translated, encodes an
amino acid.
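As a minimal illustration of the codon definition above, the sketch below splits a nucleotide sequence into three-nucleotide groups and looks each one up. The table is deliberately partial (a handful of entries from the standard genetic code), and the names `PARTIAL_CODON_TABLE`, `split_codons` and `translate` are hypothetical, chosen only for this example.

```python
# Partial excerpt of the standard genetic code; NOT a complete table.
PARTIAL_CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "UUC": "Phe",
    "AAA": "Lys", "AAG": "Lys", "UGG": "Trp",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def split_codons(seq: str) -> list[str]:
    """Split a nucleotide sequence into codons, reading T (DNA) as U (RNA)."""
    seq = seq.upper().replace("T", "U")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq: str) -> list[str]:
    """Translate codons until a stop codon; unknown codons raise KeyError."""
    residues = []
    for codon in split_codons(seq):
        amino_acid = PARTIAL_CODON_TABLE[codon]
        if amino_acid == "Stop":
            break
        residues.append(amino_acid)
    return residues


if __name__ == "__main__":
    print(translate("ATGTTTAAATGA"))
```

In this toy input, changing the second codon from TTT to AAA would replace phenylalanine with lysine, which is the kind of single-codon change referred to later under MUTANT or MUTATION.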

CONSTITUTIVELY ACTIVATED RECEPTOR shall mean a receptor subject to
constitutive receptor activation. A constitutively activated receptor can be endogenous or non-
endogenous.

CONSTITUTIVE RECEPTOR ACTIVATION shall mean stabilization of a
receptor in the active state by means other than binding of the receptor with its endogenous
ligand or a chemical equivalent thereof.

CONTACT or CONTACTING shall mean bringing at least two moieties together,
whether in an in vitro system or an in vivo system.

DIRECTLY IDENTIFYING or DIRECTLY IDENTIFIED, in relationship to the
phrase "candidate compound", shall mean the screening of a candidate compound against a
constitutively activated receptor, preferably a constitutively activated orphan receptor, and
most preferably against a constitutively activated G protein-coupled cell surface orphan
receptor, and assessing the compound efficacy of such compound. This phrase is, under no
circumstances, to be interpreted or understood to be encompassed by or to encompass the
phrase "indirectly identifying" or "indirectly identified."

ENDOGENOUS shall mean a material that a mammal naturally produces.
ENDOGENOUS in reference to, for example and not limitation, the term "receptor," shall
mean that which is naturally produced by a mammal (for example, and not limitation, a
human) or a virus. By contrast, the term NON-ENDOGENOUS in this context shall mean
that which is not naturally produced by a mammal (for example, and not limitation, a human)
or a virus. For example, and not limitation, a receptor which is not constitutively active in its
endogenous form, but when manipulated becomes constitutively active, is most preferably
referred to herein as a "non-endogenous, constitutively activated receptor." Both terms can
be utilized to describe both "in vivo" and "in vitro" systems. For example, and not limitation,
in a screening approach, the endogenous or non-endogenous receptor may be in reference to
an in vitro screening system. As a further example and not limitation, where the genome of
a mammal has been manipulated to include a non-endogenous constitutively activated
receptor, screening of a candidate compound by means of an in vivo system is viable.

G PROTEIN COUPLED RECEPTOR FUSION PROTEIN and GPCR FUSION
PROTEIN, in the context of the invention disclosed herein, each mean a non-endogenous
protein comprising an endogenous, constitutively active GPCR or a non-endogenous,
constitutively activated GPCR fused to at least one G protein, most preferably the alpha (α)
subunit of such G protein (this being the subunit that binds GTP), with the G protein
preferably being of the same type as the G protein that naturally couples with endogenous
orphan GPCR. For example, and not limitation, in an endogenous state, if the G protein
"Gsα" is the predominant G protein that couples with the GPCR, a GPCR Fusion Protein
based upon the specific GPCR would be a non-endogenous protein comprising the GPCR
fused to Gsα; in some circumstances, as will be set forth below, a non-predominant G protein
can be fused to the GPCR. The G protein can be fused directly to the C-terminus of the
constitutively active GPCR or there may be spacers between the two.

HOST CELL shall mean a cell capable of having a Plasmid and/or Vector
incorporated therein. In the case of a prokaryotic Host Cell, a Plasmid is typically replicated
as an autonomous molecule as the Host Cell replicates (generally, the Plasmid is thereafter
isolated for introduction into a eukaryotic Host Cell); in the case of a eukaryotic Host Cell,
a Plasmid is integrated into the cellular DNA of the Host Cell such that when the eukaryotic
Host Cell replicates, the Plasmid replicates. Preferably, for the purposes of the invention
disclosed herein, the Host Cell is eukaryotic, more preferably, mammalian, and most
preferably selected from the group consisting of 293, 293T and COS-7 cells.

INDIRECTLY IDENTIFYING or INDIRECTLY IDENTIFIED means the
traditional approach to the drug discovery process involving identification of an endogenous
ligand specific for an endogenous receptor, screening of candidate compounds against the
receptor for determination of those which interfere and/or compete with the ligand-receptor
interaction, and assessing the efficacy of the compound for affecting at least one second
messenger pathway associated with the activated receptor.

INHIBIT or INHIBITING, in relationship to the term "response" shall mean that a
response is decreased or prevented in the presence of a compound as opposed to in the
absence of the compound.

INVERSE AGONISTS shall mean materials (e.g., ligands, candidate compounds)
which bind to either the endogenous form of the receptor or to the constitutively activated
form of the receptor, and which inhibit the baseline intracellular response initiated by the
active form of the receptor below the normal base level of activity which is observed in the
absence of agonists or partial agonists, or decrease GTP binding to membranes. Preferably,
the baseline intracellular response is inhibited in the presence of the inverse agonist by at least
30%, more preferably by at least 50%, and most preferably by at least 75%, as compared with
the baseline response in the absence of the inverse agonist.
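
The 30%, 50% and 75% thresholds in the definition above can be applied as follows. This is a minimal sketch under stated assumptions: the function names and the signal units are hypothetical, and the document does not prescribe any particular calculation beyond a comparison with the baseline response in the absence of the inverse agonist.

```python
def percent_inhibition(baseline: float, with_compound: float) -> float:
    """Percent decrease of the response relative to the compound-free baseline."""
    if baseline <= 0:
        raise ValueError("baseline response must be positive")
    return 100.0 * (baseline - with_compound) / baseline


def preference_tier(inhibition_pct: float) -> str:
    """Map percent inhibition onto the tiers named in the definition."""
    if inhibition_pct >= 75:
        return "most preferred (at least 75%)"
    if inhibition_pct >= 50:
        return "more preferred (at least 50%)"
    if inhibition_pct >= 30:
        return "preferred (at least 30%)"
    return "below the preferred 30% threshold"


if __name__ == "__main__":
    # Hypothetical readings in arbitrary assay units.
    pct = percent_inhibition(baseline=200.0, with_compound=80.0)
    print(pct, preference_tier(pct))
```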

KNOWN RECEPTOR shall mean an endogenous receptor for which the endogenous
ligand specific for that receptor has been identified.

LIGAND shall mean an endogenous, naturally occurring molecule specific for an
endogenous, naturally occurring receptor.

MUTANT or MUTATION in reference to an endogenous receptor's nucleic acid
and/or amino acid sequence shall mean a specified change or changes to such endogenous
sequences such that a mutated form of an endogenous, non-constitutively activated receptor
evidences constitutive activation of the receptor. In terms of equivalents to specific
sequences, a subsequent mutated form of a human receptor is considered to be equivalent to
a first mutation of the human receptor if (a) the level of constitutive activation of the
subsequent mutated form of a human receptor is substantially the same as that evidenced by
the first mutation of the receptor; and (b) the percent sequence (amino acid and/or nucleic
acid) homology between the subsequent mutated form of the receptor and the first mutation
of the receptor is at least about 80%, more preferably at least about 90% and most preferably
at least 95%. Ideally, and owing to the fact that the most preferred cassettes disclosed herein
for achieving constitutive activation include a single amino acid and/or codon change
between the endogenous and the non-endogenous forms of the GPCR, the percent sequence
homology should be at least 98%.
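
Condition (b) above can be illustrated with a simple calculation. The sketch below is a simplification that assumes the two sequences are already aligned and of equal length, and it measures percent identity position by position; the document does not specify how percent homology is computed, so this method and the names `percent_identity` and `homology_tier` are assumptions. Condition (a), the level of constitutive activation, is not modeled.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def homology_tier(pct: float) -> str:
    """Map a percentage onto the thresholds stated in the definition."""
    if pct >= 98:
        return "at least 98%"
    if pct >= 95:
        return "at least 95%"
    if pct >= 90:
        return "at least about 90%"
    if pct >= 80:
        return "at least about 80%"
    return "below 80%"


if __name__ == "__main__":
    # Toy 100-residue sequence with a single substitution.
    first = "A" * 100
    second = first[:49] + "K" + first[50:]
    pct = percent_identity(first, second)
    print(pct, homology_tier(pct))
```

A single amino acid change in a receptor of several hundred residues leaves the identity well above 98%, which is consistent with the rationale given in the definition.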

NON-ORPHAN RECEPTOR shall mean an endogenous naturally occurring
molecule specific for an endogenous naturally occurring ligand wherein the binding of a
ligand to a receptor activates an intracellular signaling pathway.

ORPHAN RECEPTOR shall mean an endogenous receptor for which the
endogenous ligand specific for that receptor has not been identified or is not known.

PHARMACEUTICAL COMPOSITION shall mean a composition comprising at
least one active ingredient, whereby the composition is amenable to investigation for a
specified, efficacious outcome in a mammal (for example, and not limitation, a human). Those
of ordinary skill in the art will understand and appreciate the techniques appropriate for
determining whether an active ingredient has a desired efficacious outcome based upon the
needs of the artisan.

PLASMID shall mean the combination of a Vector and cDNA. Generally, a Plasmid
is introduced into a Host Cell for the purposes of replication and/or expression of the cDNA
as a protein.

STIMULATE or STIMULATING, in relationship to the term "response" shall mean
that a response is increased in the presence of a compound as opposed to in the absence of the
compound.

VECTOR in reference to cDNA shall mean a circular DNA capable of incorporating
at least one cDNA and capable of incorporation into a Host Cell.

The order of the following sections is set forth for presentational efficiency and is not
intended, nor should be construed, as a limitation on the disclosure or the claims to follow.

A. Introduction

The traditional study of receptors has always proceeded from the a priori assumption
(historically based) that the endogenous ligand must first be identified before discovery could
proceed to find antagonists and other molecules that could affect the receptor. Even in cases
where an antagonist might have been known first, the search immediately extended to looking
for the endogenous ligand. This mode of thinking has persisted in receptor research even after
the discovery of constitutively activated receptors. What has not been heretofore recognized
is that it is the active state of the receptor that is most useful for discovering agonists, partial
agonists, and inverse agonists of the receptor. For those diseases which result from an overly
active receptor or an under-active receptor, what is desired in a therapeutic drug is a
compound which acts to diminish the active state of a receptor or enhance the activity of the
receptor, respectively, not necessarily a drug which is an antagonist to the endogenous ligand.
This is because a compound that reduces or enhances the activity of the active receptor state
need not bind at the same site as the endogenous ligand. Thus, as taught by a method of this
invention, any search for therapeutic compounds should start by screening compounds against
the ligand-independent active state.

B. Identification of Human GPCRs

The efforts of the Human Genome project have led to the identification of a plethora of
information regarding nucleic acid sequences located within the human genome; it has been
the case in this endeavor that genetic sequence information has been made available without
an understanding or recognition as to whether or not any particular genomic sequence does
or may contain open-reading frame information that translates into human proteins. Several
methods of identifying nucleic acid sequences within the human genome are within the
purview of those having ordinary skill in the art. For example, and not limitation, a variety
of human GPCRs, disclosed herein, were discovered by reviewing the GenBank™ database,
while other GPCRs were discovered by utilizing a nucleic acid sequence of a GPCR,
previously sequenced, to conduct a BLAST™ search of the EST database. Table B, below,
lists several endogenous GPCRs that we have discovered, along with a GPCR's respective
homologous receptor.

TABLE B

Disclosed      Accession      Open Reading   Per Cent Homology              Reference To
Human Orphan   Number         Frame          To Designated GPCR             Homologous GPCR
GPCRs          Identified     (Base Pairs)                                  (Accession No.)

hARE-3         AL033379       1,260 bp       52.3% LPA-R                    U92642
hARE-4         AC006087       1,119 bp       36% P2Y5                       AF000546
hARE-5         AC006255       1,104 bp       32% Oryzias latipes            D43633
hGPR27         AA775870       1,128 bp
hARE-1         AI090920       999 bp         43% KIAA0001                   D13626
hARE-2         AA359504       1,122 bp       53% GPR27
hPPR1          H67224         1,053 bp       39% EBI1                       L31581
hG2A           AA754702       1,113 bp       31% GPR4                       L36148
hRUP3          AL035423       1,005 bp       30% [illegible]                2133653
hRUP4          AI307658       1,296 bp       32% nNPGPR;                    [illegible]
                                             28% and 29% Zebra fish Ya
                                             and Yb, respectively
hRUP5          AC005849       1,413 bp       25% DEZ                        Q99788
                                             23% FMLPR                      P21462
hRUP6          AC005871       1,245 bp       48% GPR66                      NP_006047
hRUP7          AC007922       1,173 bp       43% H3R                        AF140538
hCHN3          EST 36581      1,113 bp       53% GPR27
hCHN4          AA804531       1,077 bp       32% thrombin                   4503637
hCHN6          EST 2134670    1,503 bp       36% edg-1                      NP_001391
hCHN8          EST 764455     1,029 bp       47% KIAA0001                   D13626
hCHN9          EST 1541536    1,077 bp       41% LTB4R                      NM_000752
hCHN10         EST 1365839    1,055 bp       35% P2Y                        NM_002563


Receptor homology is useful in terms of gaining an appreciation of a role of the
receptors within the human body. As the patent document progresses, we will disclose
techniques for mutating these receptors to establish non-endogenous, constitutively activated
versions of these receptors.

The techniques disclosed herein have also been applied to other human, orphan
GPCRs known to the art, as will be apparent as the patent document progresses.



C. Receptor Screening

Screening candidate compounds against a non-endogenous, constitutively activated
version of the human GPCRs disclosed herein allows for the direct identification of candidate
compounds which act at this cell surface receptor, without requiring use of the receptor's
endogenous ligand. By determining areas within the body where the endogenous version of
human GPCRs disclosed herein is expressed and/or over-expressed, it is possible to determine
related disease/disorder states which are associated with the expression and/or over-expression
of the receptor; such an approach is disclosed in this patent document.

Creation of a mutation that may evidence constitutive activation of the
human GPCRs disclosed herein is based upon the distance from the proline residue that
is presumed to be located within TM6 of the GPCR; this algorithmic technique is disclosed
in co-pending and commonly assigned patent document U.S. Serial Number 09/170,496,
incorporated herein by reference. The algorithmic technique is not predicated upon traditional
sequence "alignment" but rather a specified distance from the aforementioned TM6 proline
residue. By mutating the amino acid residue located 16 amino acid residues from this residue
(presumably located in the IC3 region of the receptor) to, most preferably, a lysine residue,
such activation may be obtained. Other amino acid residues may be useful in the mutation
at this position to achieve this objective.
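
The positional rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure of U.S. Serial Number 09/170,496: it assumes the TM6 proline position is already known, that the target residue lies 16 positions toward the amino terminus (consistent with the target being in the IC3 region, which precedes TM6), and that positions are counted from 1. The function name and the toy sequence are hypothetical.

```python
def mutate_from_tm6_proline(sequence: str, proline_pos: int,
                            offset: int = 16, new_residue: str = "K"):
    """Replace the residue `offset` positions N-terminal of the TM6 proline.

    `proline_pos` is 1-based. Returns the mutated sequence and a label in the
    common original-residue/position/new-residue style (for example F236K).
    """
    if sequence[proline_pos - 1] != "P":
        raise ValueError("residue at proline_pos is not a proline")
    target = proline_pos - offset  # assumption: counted toward the N-terminus
    if target < 1:
        raise ValueError("target position falls before the start of the sequence")
    original = sequence[target - 1]
    mutated = sequence[:target - 1] + new_residue + sequence[target:]
    return mutated, f"{original}{target}{new_residue}"


if __name__ == "__main__":
    # Toy sequence: phenylalanine at position 10, proline at position 26.
    toy = "A" * 9 + "F" + "A" * 15 + "P" + "A" * 4
    mutated, label = mutate_from_tm6_proline(toy, proline_pos=26)
    print(label, mutated)
```

The default replacement is lysine, the most preferred residue named in the text; passing a different `new_residue` corresponds to the statement that other amino acid residues may be useful at this position.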

D. Disease/Disorder Identification and/or Selection

As will be set forth in greater detail below, most preferably inverse agonists to the
non-endogenous, constitutively activated GPCR can be identified by the methodologies of this
invention. Such inverse agonists are ideal candidates as lead compounds in drug discovery
programs for treating diseases related to this receptor. Because of the ability to directly
identify inverse agonists to the GPCR, thereby allowing for the development of
pharmaceutical compositions, a search for diseases and disorders associated with the GPCR
is relevant. For example, scanning both diseased and normal tissue samples for the presence
of the GPCR now becomes more than an academic exercise or one which might be pursued
along the path of identifying an endogenous ligand to the specific GPCR. Tissue scans can
be conducted across a broad range of healthy and diseased tissues. Such tissue scans provide
a preferred first step in associating a specific receptor with a disease and/or disorder. See, for
example, co-pending application (docket number AREN-0050) for exemplary dot-blot and RT-
PCR results of several of the GPCRs disclosed herein.

Preferably, the DNA sequence of the human GPCR is used to make a probe for (a)
dot-blot analysis against tissue-mRNA, and/or (b) RT-PCR identification of the expression
of the receptor in tissue samples. The presence of a receptor in a tissue source, or a
diseased tissue, or the presence of the receptor at elevated concentrations in diseased tissue
compared to a normal tissue, can be preferably utilized to identify a correlation with a
treatment regimen, including but not limited to, a disease associated with that disease.
Receptors can equally well be localized to regions of organs by this technique. Based on
the known functions of the specific tissues to which the receptor is localized, the putative
functional role of the receptor can be deduced.

E. Screening of Candidate Compounds

1. Generic GPCR screening assay techniques

When a G protein receptor becomes constitutively active, it binds to a G protein (e.g.,
Gq, Gs, Gi, Gz, Go) and stimulates the binding of GTP to the G protein. The G protein then
acts as a GTPase and slowly hydrolyzes the GTP to GDP, whereby the receptor, under normal
conditions, becomes deactivated. However, constitutively activated receptors continue to
exchange GDP for GTP. A non-hydrolyzable analog of GTP, [35S]GTPγS, can be used to
monitor enhanced binding to membranes which express constitutively activated receptors.
It is reported that [35S]GTPγS can be used to monitor G protein coupling to membranes in the
absence and presence of ligand. An example of this monitoring, among other examples well-
known and available to those in the art, was reported by Traynor and Nahorski in 1995. The
preferred use of this assay system is for initial screening of candidate compounds because the
system is generically applicable to all G protein-coupled receptors regardless of the particular
G protein that interacts with the intracellular domain of the receptor.

2. Specific GPCR screening assay techniques 

Once candidate compounds are identified using the "generic" G protein-coupled 
5 receptor assay (/. e., an assay to select compounds that are agonists, partial agonists, or inverse 
agonists), further screening to confirm that the compounds have interacted at the receptor site 
is preferred. For example, a compound identified by the "generic" assay may not bind to the 
receptor, but may instead merely "uncouple" the G protein from the intracellular domain. 

<l Gsy Gz and Gi. 

10 Gs stimulates the enzyme adenylyl cyclase. Gi (and Gz and Go), on the other hand, 

inhibit this enzyme. Adenylyl cyclase , catalyzes the. conversion of ATP to cAMP; thus, 
constitutively activated GPCRs that couple the'Gs protein are associated with increased 
cellular levels of cAMP. On the other hand, constitutively activated GPCRs that couple Gi 
(or Gz, Go) protein are associated with decreased cellular levels of cAMP. See, generally, 

"Indirect Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3rd Ed.)
Nicholls, J.G. et al eds. Sinauer Associates, Inc. (1992). Thus, assays that detect cAMP can
be utilized to determine if a candidate compound is, e.g., an inverse agonist to the receptor
(i.e., such a compound would decrease the levels of cAMP). A variety of approaches known
in the art for measuring cAMP can be utilized; a most preferred approach relies upon the use
of anti-cAMP antibodies in an ELISA-based format. Another type of assay that can be
utilized is a whole cell second messenger reporter system assay. Promoters on genes drive
the expression of the proteins that a particular gene encodes. Cyclic AMP drives gene
expression by promoting the binding of a cAMP-responsive DNA binding protein or
transcription factor (CREB) that then binds to the promoter at specific sites called cAMP
response elements and drives the expression of the gene. Reporter systems can be constructed
which have a promoter containing multiple cAMP response elements before the reporter gene,
e.g., β-galactosidase or luciferase. Thus, a constitutively activated Gs-linked receptor causes
the accumulation of cAMP that then activates the gene and expression of the reporter protein.
The reporter protein such as β-galactosidase or luciferase can then be detected using standard
biochemical assays (Chen et al. 1995).
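
By way of illustration only, the cAMP readout logic described above for a Gs-coupled receptor can be sketched as follows. The function name, the 10% tolerance and the output labels are hypothetical choices made for this sketch and are not taken from this patent document; any real cutoff would have to be established for the particular assay.

```python
def classify_gs_camp(baseline, treated, tolerance=0.10):
    """Label a candidate compound from the relative change in cAMP.

    Applies to a constitutively activated, Gs-coupled receptor only.
    baseline: cAMP reading without the compound (any positive unit).
    treated:  cAMP reading with the compound (same unit).
    tolerance: hypothetical cutoff; smaller relative changes count as no effect.
    """
    if baseline <= 0:
        raise ValueError("baseline cAMP must be positive")
    change = (treated - baseline) / baseline
    if abs(change) < tolerance:
        return "no effect"
    # For a Gs-coupled receptor, lower cAMP suggests inverse agonism,
    # higher cAMP suggests agonism.
    return "inverse-agonist-like" if change < 0 else "agonist-like"
```

Such a comparison only flags candidates; as noted above, further screening to confirm interaction at the receptor site remains preferred.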
b. Go and Gq.

Gq and Go are associated with activation of the enzyme phospholipase C, which in
turn hydrolyzes the phospholipid PIP2, releasing two intracellular messengers:
diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). Increased accumulation of IP3
is associated with activation of Gq- and Go-associated receptors. See, generally, "Indirect
Mechanisms of Synaptic Transmission," Chpt. 8, From Neuron To Brain (3rd Ed.) Nicholls,
J.G. et al eds. Sinauer Associates, Inc. (1992). Assays that detect IP3 accumulation can be
utilized to determine if a candidate compound is, e.g., an inverse agonist to a Gq- or Go-
associated receptor (i.e., such a compound would decrease the levels of IP3). Gq-associated
receptors can also be examined using an AP1 reporter assay in that Gq-dependent
phospholipase C causes activation of genes containing AP1 elements; thus, activated Gq-
associated receptors will evidence an increase in the expression of such genes, whereby
inverse agonists thereto will evidence a decrease in such expression, and agonists will
evidence an increase in such expression. Commercially available assays for such detection
are available.

The use of an endogenous, constitutively activated orphan GPCR or a non-endogenous,
constitutively activated orphan GPCR, for use in screening of candidate compounds for the
direct identification of inverse agonists, agonists and partial agonists provides an interesting
screening challenge in that, by definition, the receptor is active even in the absence of an
endogenous ligand bound thereto. Thus, in order to differentiate between, e.g., the non-
endogenous receptor in the presence of a candidate compound and the non-endogenous
receptor in the absence of that compound, with an aim of such a differentiation to allow for
an understanding as to whether such compound may be an inverse agonist, agonist, partial
agonist or have no effect on such a receptor, it is preferred that an approach be utilized that
can enhance such differentiation. A preferred approach is the use of a GPCR Fusion Protein.

Generally, once it is determined that a non-endogenous orphan GPCR has been
constitutively activated using the assay techniques set forth above (as well as others), it is
possible to determine the predominant G protein that couples with the endogenous GPCR.
Coupling of the G protein to the GPCR provides a signaling pathway that can be assessed.
Because it is most preferred that screening take place by use of a mammalian expression
system, such a system will be expected to have endogenous G protein therein. Thus, by
definition, in such a system, the non-endogenous, constitutively activated orphan GPCR will
continuously signal. In this regard, it is preferred that this signal be enhanced such that in the
presence of, e.g., an inverse agonist to the receptor, it is more likely that one will be able to
readily differentiate, particularly in the context of screening, the receptor when it is
contacted with the inverse agonist from the receptor when it is not.

The GPCR Fusion Protein is intended to enhance the efficacy of G protein coupling
with the non-endogenous GPCR. The GPCR Fusion Protein is preferred for screening with
a non-endogenous, constitutively activated GPCR because such an approach increases the
signal that is most preferably utilized in such screening techniques. This is important in
facilitating a significant "signal-to-noise" ratio; such a significant ratio is preferred for
the screening of candidate compounds as disclosed herein.

The construction of a construct useful for expression of a GPCR Fusion Protein is
within the purview of those having ordinary skill in the art. Commercially available
expression vectors and systems offer a variety of approaches that can fit the particular needs
of an investigator. The criteria of importance for such a GPCR Fusion Protein construct are
that the endogenous GPCR sequence and the G protein sequence both be in-frame (preferably,
the sequence for the endogenous GPCR is upstream of the G protein sequence) and that the
"stop" codon of the GPCR must be deleted or replaced such that upon expression of the
GPCR, the G protein can also be expressed. The GPCR can be linked directly to the G
protein, or there can be spacer residues between the two (preferably, no more than about 12,
although this number can be readily ascertained by one of ordinary skill in the art). We have
a preference (based upon convenience) for use of a spacer in that some restriction sites that are
not used will, effectively, upon expression, become a spacer. Most preferably, the G protein
that couples to the non-endogenous GPCR will have been identified prior to the creation of
the GPCR Fusion Protein construct. Because there are only a few G proteins that have been
identified, it is preferred that a construct comprising the sequence of the G protein (i.e., a
universal G protein construct) be available for insertion of an endogenous GPCR sequence
therein; this provides for efficiency in the context of large-scale screening of a variety of
different endogenous GPCRs having different sequences.

As noted above, constitutively activated GPCRs that couple to Gi, Gz and Go are
expected to inhibit the formation of cAMP, making assays based upon these types of GPCRs
challenging (i.e., the cAMP signal decreases upon activation, thus making the direct
identification of, e.g., inverse agonists (which would further decrease this signal), interesting).
As will be disclosed herein, we have ascertained that for these types of receptors, it is possible
to create a GPCR Fusion Protein that is not based upon the endogenous GPCR's endogenous
G protein, in an effort to establish a viable cyclase-based assay. Thus, for example, for a Gz
coupled receptor such as H9, a GPCR Fusion Protein can be established that utilizes a Gs
fusion protein; we believe that such a fusion construct, upon expression, "drives" or "forces"
the non-endogenous GPCR to couple with, e.g., Gs rather than the "natural" Gz protein, such
that a cyclase-based assay can be established. Thus, for Gi, Gz and Go coupled receptors, we
prefer that when a GPCR Fusion Protein is used and the assay is based upon detection of
adenylyl cyclase activity, the fusion construct be established with Gs (or an equivalent G
protein that stimulates the formation of the enzyme adenylyl cyclase).
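
The construct criteria set forth above (GPCR upstream, "stop" codon of the GPCR removed, both sequences in-frame, and an optional spacer of preferably no more than about 12 residues) can be sketched as a simple sequence check. This is a minimal sketch under stated assumptions: the function name, the parameter names and the toy sequences in the usage note are hypothetical, and the sketch checks reading frame only; it does not model any particular vector or cloning strategy.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def build_fusion_orf(gpcr_orf, g_protein_orf, spacer="", max_spacer_residues=12):
    """Join a GPCR ORF to a G protein ORF in-frame (hypothetical helper).

    gpcr_orf:      coding sequence of the GPCR, ATG through stop codon.
    g_protein_orf: coding sequence of the G protein, ATG through stop codon.
    spacer:        optional in-frame spacer nucleotides (e.g., unused restriction sites).
    """
    gpcr, gprot, spacer = gpcr_orf.upper(), g_protein_orf.upper(), spacer.upper()
    for name, seq in (("GPCR", gpcr), ("G protein", gprot), ("spacer", spacer)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3 (frame shift)")
    if len(spacer) // 3 > max_spacer_residues:
        raise ValueError("spacer exceeds the preferred number of residues")
    # Delete the GPCR stop codon so that translation continues into the G protein.
    if gpcr[-3:] in STOP_CODONS:
        gpcr = gpcr[:-3]
    fusion = gpcr + spacer + gprot
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]
    # Any stop codon before the final codon would truncate the fusion protein.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("internal stop codon in fusion ORF")
    return fusion
```

For example, with toy sequences, `build_fusion_orf("ATGGCCTAA", "ATGGGCTGA", spacer="GGATCC")` yields a single open reading frame in which the spacer encodes two residues between the GPCR and the G protein.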

F. Medicinal Chemistry

Generally, but not always, direct identification of candidate compounds is preferably
conducted in conjunction with compounds generated via combinatorial chemistry techniques,
whereby thousands of compounds are randomly prepared for such analysis. Generally, the
results of such screening will be compounds having unique core structures; thereafter, these
compounds are preferably subjected to additional chemical modification around a preferred
core structure(s) to further enhance the medicinal properties thereof. Such techniques are
known to those in the art and will not be addressed in detail in this patent document.

G. Pharmaceutical compositions

Candidate compounds selected for further development can be formulated into
pharmaceutical compositions using techniques well known to those in the art. Suitable
pharmaceutically-acceptable carriers are available to those in the art; for example, see
Remington's Pharmaceutical Sciences ([...] eds.).

H. Other Utility

Although a preferred use of the non-endogenous versions of the human GPCRs disclosed
herein may be for the direct identification of candidate compounds as inverse agonists,
agonists or partial agonists (preferably for use as pharmaceutical agents), these versions of
human GPCRs can also be utilized in research settings. For example, in vitro and in vivo
systems incorporating GPCRs can be utilized to further elucidate and understand the roles
these receptors play in the human condition, both normal and diseased, as well as
understanding the role of constitutive activation as it applies to understanding the signaling
cascade. The value in non-endogenous human GPCRs is that their utility as a research tool
is enhanced in that, because of their unique features, non-endogenous human GPCRs can be
used to understand the role of these receptors in the human body before the endogenous
ligand therefor is identified. Other uses of the disclosed receptors will become apparent to
those in the art based upon, inter alia, a review of this patent document.

EXAMPLES

The following examples are presented for purposes of elucidation, and not limitation, of
the present invention. While specific nucleic acid and amino acid sequences are disclosed
herein, those of ordinary skill in the art are credited with the ability to make minor
modifications to these sequences while achieving the same or substantially similar results
reported below. The traditional approach to application or understanding of sequence
cassettes from one sequence to another (e.g., from rat receptor to human receptor or from
human receptor A to human receptor B) is generally predicated upon sequence alignment
techniques whereby the sequences are aligned in an effort to determine areas of commonality.
The mutational approach disclosed herein does not rely upon this approach but is instead
based upon an algorithmic approach and a positional distance from a conserved proline
residue located within the TM6 region of human GPCRs. Once this approach is secured,
those in the art are credited with the ability to make minor modifications thereto to achieve
substantially the same results (i.e., constitutive activation) disclosed herein. Such modified
approaches are considered within the purview of this disclosure.

Example 1
Endogenous Human GPCRs

1. Identification of Human GPCRs

Certain of the disclosed endogenous human GPCRs were identified based upon a
review of the GenBank™ database information. While searching the database, the following
cDNA clones were identified as evidenced below (Table C).

TABLE C

Disclosed Human   Accession   Complete DNA    Open Reading    Nucleic Acid   Amino Acid
Orphan GPCRs      Number      Sequence        Frame           SEQ.ID.NO.     SEQ.ID.NO.
                              (Base Pairs)    (Base Pairs)

hARE-3            AL033379    111,389 bp      1,260 bp        1              2
hARE-4            AC006087    226,925 bp      1,119 bp        3              4
hARE-5            AC006255    127,605 bp      1,104 bp        5              6
hRUP3             AL035423    140,094 bp      1,005 bp        7              8
hRUP5             AC005849    169,144 bp      1,413 bp        9              10
hRUP6             AC005871    218,807 bp      1,245 bp        11             12
hRUP7             AC007922    158,858 bp      1,173 bp        13             14


Other disclosed endogenous human GPCRs were identified by conducting a BLAST™
search of the EST database (dbest) using the following EST clones as query sequences. The
following EST clones identified were then used as a probe to screen a human genomic library
(Table D).

TABLE D

Disclosed Human   Query             EST Clone/                Open Reading    Nucleic Acid   Amino Acid
Orphan GPCRs      (Sequence)        Accession No.             Frame           SEQ.ID.NO.     SEQ.ID.NO.
                                    Identified                (Base Pairs)

hGPCR27           Mouse GPCR27      AA775870                  1,125 bp        17             18
hARE-1            TDAG              1689643                   999 bp          19             20
                                    AI090920
hARE-2            GPCR27            68530                     1,122 bp        21             22
                                    AA359504
hPPR1             Bovine PPR1       238667                    1,053 bp        23             24
                                    H67224
hG2A              Mouse 1179426     See Example 2(a), below   1,113 bp        25             26
hCHN3             N.A.              EST 36581                 1,113 bp        27             28
                                    (full length)
hCHN4             TDAG              1184934                   1,077 bp        29             30
                                    AA804531
hCHN6             N.A.              EST 2134670               1,503 bp        31             32
                                    (full length)
hCHN8             KIAA0001          EST 764455                1,029 bp        33             34
hCHN9             1365839           EST 1541536               1,077 bp        35             36
hCHN10            Mouse EST         Human 1365839             1,005 bp        37             38
                  1365839
hRUP4             N.A.              AI307658                  1,296 bp        39             40

N.A. = "not applicable".




2. Full Length Cloning
a. Human G2A

Mouse EST clone 1179426 was used to obtain a human genomic clone containing all
but three amino acid G2A coding sequences. The 5' end of this coding sequence was obtained by
using 5' RACE, and the template for PCR was Clontech's Human Spleen Marathon-Ready™
cDNA. The disclosed human G2A was amplified by PCR using the G2A cDNA specific
primers for the first and second round PCR as shown in SEQ.ID.NO.: 41 and SEQ.ID.NO.: 42
as follows:

5'-CTGTGTACAGCAGTTCGCAGAGTG-3' (SEQ.ID.NO.: 41; 1st round PCR)
5'-GAGTGCCAGGCAGAGCAGGTAGAC-3' (SEQ.ID.NO.: 42; second round PCR).
PCR was performed using Advantage GC Polymerase Kit (Clontech; manufacturer's
instructions will be followed), at 94°C for 30 sec followed by 5 cycles of 94°C for 5 sec and
72°C for 4 min; and 30 cycles of 94°C for 5 sec and 70°C for 4 min. An approximately 1.3 Kb
PCR fragment was purified from agarose gel, digested with Hind III and Xba I and cloned into
the expression vector pRC/CMV2 (Invitrogen). The cloned insert was sequenced using the
T7 Sequenase™ kit (USB Amersham; manufacturer instructions followed) and the sequence
was compared with the presented sequence. Expression of the human G2A was detected by
probing an RNA dot blot (Clontech; manufacturer instructions followed) with the P32-labeled
fragment.

b. CHN9

Sequencing of the EST clone 1541536 showed CHN9 to be a partial cDNA clone
having only an initiation codon; i.e., the termination codon was missing. When CHN9
was used in a BLAST search against the database (nr), the 3' sequence of CHN9 was 100%
homologous to the 5' untranslated region of the leukotriene B4 receptor cDNA, which contained a
termination codon in frame with the CHN9 coding sequence. To determine whether the 5'
untranslated region of LTB4R cDNA was the 3' sequence of CHN9, PCR was performed
using primers based upon the 5' sequence flanking the initiation codon found in CHN9 and
the 3' sequence around the termination codon found in the LTB4R 5' untranslated region.
The primer sequences utilized were as follows:

5'-CCCGAATTCCTGCTTGCTCGCAGCTTOGCCC-3' (SEQ.ID.NO.: 43; sense) and

5'-tGfGGATCCTGlCTGTC^ ' ""~ ~ " 

PCR was performed using thymus cDNA as a template and rTth polymerase (Perkin Elmer)
with the buffer system provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM
of each 4 nucleotides. The cycle condition was 30 cycles of 94°C for 1 min, 65°C for 1 min
and 72°C for 1 min and 10 sec. A 1.1 kb fragment consistent with the predicted size was
obtained from PCR. This PCR fragment was subcloned into pCMV (see below) and
sequenced (see, SEQ.ID.NO.: 35).

c. RUP4

The full length RUP4 was cloned by RT-PCR with human brain cDNA (Clontech) as
template and the following primers:

5'-TCACAATGCTAGGTGTGGTC-3' (SEQ.ID.NO.: 45; sense) and
5'-TGCATAGACAATGGGATTACAG-3' (SEQ.ID.NO.: 46; antisense).

PCR was performed using TaqPlus Precision™ polymerase (Stratagene; manufacturer's
instructions followed) by the following cycles: 94°C for 2 min; 94°C for 30 sec; 55°C for 30 sec;
72°C for 45 sec; and 72°C for 10 min. Cycles 2 through 4 were repeated 30 times.
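
As a simple arithmetic illustration, the total programmed hold time of such a cycling profile can be computed as sketched below. The sketch assumes that "cycles 2 through 4" denotes the three steps (denaturation, annealing, extension) that are repeated, and it ignores ramp times between temperatures; the function name is hypothetical.

```python
def program_hold_seconds(initial, cycle_steps, cycles, final):
    """Sum of programmed hold times, in seconds (thermal ramping not included)."""
    return initial + cycles * sum(cycle_steps) + final

# RUP4 profile as read from the text above:
# 94°C 2 min; 30 x (94°C 30 sec, 55°C 30 sec, 72°C 45 sec); 72°C 10 min.
rup4_seconds = program_hold_seconds(initial=120, cycle_steps=[30, 30, 45],
                                    cycles=30, final=600)
print(rup4_seconds / 60)  # hold time in minutes
```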

The PCR products were separated on a 1% agarose gel and a 500 bp PCR fragment
was isolated and cloned into the pCRII-TOPO™ vector (Invitrogen) and sequenced using the
T7 DNA Sequenase™ kit (Amersham) and the SP6/T7 primers (Stratagene). Sequence analysis
revealed that the PCR fragment was indeed an alternatively spliced form of AI307658 having
a continuous open reading frame with similarity to other GPCRs. The completed sequence
of this PCR fragment was as follows:
5'-TCAGAATGCTAGGTGTGGTCTGGCTGGTGGCAGTCATCGTAGGATCACCCATGTGGCAC
GTGCAACAACTTGAGAfCAAATATGACTTCCTATATGAAAAGGAACACATCTGCTGCTTAAGA
GTGGACCAGCCCTGTGCACCAGAAGATCTACACCACCTTCATCCTTGTCATCCTCTTCCTCCTGC
CTCTTATGGTGATGCTTATTCTGTACGTAAAATTGGTTATGAACTTTGGATAAAGAAAAGAGTT
GGGGATGGTTCAGTGCTTCGAACTATTCATGGAAAAGAAATGTCCAAAATAGCCAGGAAGAAG
AAACGAGCTGTCATTATGATGGTGACAGTGGTGGCTCTCTTTGCTGTGTGCTGGGCACCATTCC
ATGTTGTCCATATGATGATTGAATACAGTAATTTTGAAAAGGAATATGATGATGTCACAATCAA
GATGATTTTTGCTATCGTGCAAATTATTGGATTTTCCAACTCCATCTGTAATCCCATTGTCTATGCA-3'
(SEQ.ID.NO.: 47)

Based on the above sequence, two sense oligonucleotide primer sets:
5'-CTGCTTAGAAGAGTGGACCAG-3' (SEQ.ID.NO.: 48; oligo 1),
5'-CTGTGCACCAGAAGATCTACAC-3' (SEQ.ID.NO.: 49; oligo 2) and
two antisense oligonucleotide primer sets:
5'-CAAGGATGAAGGTGGTGTAGA-3' (SEQ.ID.NO.: 50; oligo 3)
5'-GTGTAGATCTTCTGGTGCACAGG-3' (SEQ.ID.NO.: 51; oligo 4)
were used for 3'- and 5'-RACE PCR with a human brain Marathon-Ready™ cDNA
(Clontech, Cat# 7400-1) as template, according to manufacturer's instructions. DNA
fragments generated by the RACE PCR were cloned into the pCRII-TOPO™ vector
(Invitrogen) and sequenced using the SP6/T7 primers (Stratagene) and some internal primers.
The 3' RACE product contained a poly(A) tail and a completed open reading frame ending
at a TAA stop codon. The 5' RACE product contained an incomplete 5' end; i.e., the ATG
initiation codon was not present.
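
The relationship between a sense and an antisense primer can be checked by taking the reverse complement of the antisense primer. The sketch below is offered only as an illustration; the helper names are hypothetical, and the two sequences are oligo 2 and oligo 4 as printed above. Under that reading, the two oligos cover the same site on opposite strands.

```python
# Translation table for Watson-Crick base pairing (DNA, unambiguous bases only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def cover_same_site(sense, antisense):
    """True if one primer lies entirely within the reverse complement of the other."""
    rc = reverse_complement(antisense)
    return sense.upper() in rc or rc in sense.upper()

OLIGO_2 = "CTGTGCACCAGAAGATCTACAC"   # SEQ.ID.NO.: 49, sense
OLIGO_4 = "GTGTAGATCTTCTGGTGCACAGG"  # SEQ.ID.NO.: 51, antisense

print(reverse_complement(OLIGO_4))
print(cover_same_site(OLIGO_2, OLIGO_4))
```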

Based on the new 5' sequence, oligo 3 and the following primer:
5'-GCAATGCAGGTCATAGTGAGC-3' (SEQ.ID.NO.: 52; oligo 5)
were used for the second round of 5' RACE PCR and the PCR products were analyzed as above.
A third round of 5' RACE PCR was carried out utilizing antisense primers:
5'-TGGAGCATGGTGACGGGAATGCAGAAG-3' (SEQ.ID.NO.: 53; oligo 6) and
5'-GTGATGAGCAGGTCACTGAGCGCCAAG-3' (SEQ.ID.NO.: 54; oligo 7).
The sequence of the 5' RACE PCR products revealed the presence of the initiation codon
ATG, and a further round of 5' RACE PCR did not generate any more 5' sequence. The
completed 5' sequence was confirmed by RT-PCR using sense primer
5'-GCAATGCAGGCGCTTAACATTAC-3' (SEQ.ID.NO.: 55; oligo 8)
and oligo 4 as primers and sequence analysis of the 650 bp PCR product generated from
human brain and heart cDNA templates (Clontech, Cat# 7404-1). The completed 3' sequence
was confirmed by RT-PCR using oligo 2 and the following antisense primer:
5'-TTGGGTTACAATCTGAAGGGCA-3' (SEQ.ID.NO.: 56; oligo 9)
and sequence analysis of the 670 bp PCR product generated from human brain and heart
cDNA templates (Clontech, Cat# 7404-1).

d. RUP5

The full length RUP5 was cloned by RT-PCR using a sense primer upstream from
ATG, the initiation codon (SEQ.ID.NO.: 57), and an antisense primer containing TCA as the
stop codon (SEQ.ID.NO.: 58), which had the following sequences:
5'-ACTCCGTGTGCAGCAGGACTCTG-3' (SEQ.ID.NO.: 57)
5'-TGCGTGTTCCTGGACCCTCACGTG-3' (SEQ.ID.NO.: 58)
and human peripheral leukocyte cDNA (Clontech) as a template. Advantage™ cDNA
polymerase (Clontech) was used for the amplification in a 50 µl reaction by the following cycle
with step 2 through step 4 repeated 30 times: 94°C for 30 sec; 94°C for 15 sec; 69°C for 40 sec;
72°C for 3 min; and 72°C for 6 min. A 1.4 kb PCR fragment was isolated and cloned with
the pCRII-TOPO™ vector (Invitrogen) and completely sequenced using the T7 DNA
Sequenase™ kit (Amersham). See, SEQ.ID.NO.: 9.

e. RUP6

The full length RUP6 was cloned by RT-PCR using primers:
5'-CAGGCCTTGGATTTTAATGTCAGGGATGG-3' (SEQ.ID.NO.: 59) and
5'-GGAGAGTCAGCTCTGAAAGAATTCAGG-3' (SEQ.ID.NO.: 60);
and human thymus Marathon-Ready™ cDNA (Clontech) as a template. Advantage cDNA
polymerase (Clontech, according to manufacturer's instructions) was used for the
amplification in a 50 µl reaction by the following cycle: 94°C for 30 sec; 94°C for 5 sec; 66°C
for 40 sec; 72°C for 2.5 sec and 72°C for 7 min. Cycles 2 through 4 were repeated 30 times.
A 1.3 Kb PCR fragment was isolated and cloned into the pCRII-TOPO™ vector (Invitrogen)
and completely sequenced (see, SEQ.ID.NO.: 11) using the ABI Big Dye Terminator™ kit
(P.E. Biosystem).

f. RUP7

The full length RUP7 was cloned by RT-PCR using primers:
5'-TGATGTGATGCCAGATACTAATAGCAC-3' (SEQ.ID.NO.: 61; sense) and
5'-CCTGATTCATTTAGGTGAGATTGAGAC-3' (SEQ.ID.NO.: 62; antisense)
and human peripheral leukocyte cDNA (Clontech) as a template. Advantage™ cDNA
polymerase (Clontech) was used for the amplification in a 50 µl reaction by the following
cycle with step 2 to step 4 repeated 30 times: 94°C for 2 minutes; 94°C for 15 seconds; 60°C
for 20 seconds; 72°C for 2 minutes; 72°C for 10 minutes. A 1.25 Kb PCR fragment was
isolated and cloned into the pCRII-TOPO™ vector (Invitrogen) and completely sequenced
using the ABI Big Dye Terminator™ kit (P.E. Biosystem). See, SEQ.ID.NO.: 13.
3. Angiotensin II Type 1 Receptor ("AT1")

The endogenous human angiotensin II type 1 receptor ("AT1") was obtained by PCR
using genomic DNA as template and rTth polymerase (Perkin Elmer) with the buffer system
provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each 4 nucleotides.
The cycle condition was 30 cycles of 94°C for 1 min, 55°C for 1 min and 72°C for 1.5 min.
The 5' PCR primer contains a HindIII site with the sequence:
5'-CCCAAGCTTCGCCAGGTGTATTTGAT-3' (SEQ.ID.NO.: 63)
and the 3' primer contains a BamHI site with the following sequence:
5'-GTTGGATCCACATAATGCATTTTCTC-3' (SEQ.ID.NO.: 64).

The resulting 1.3 kb PCR fragment was digested with HindIII and BamHI and cloned into
HindIII-BamHI site of pCMV expression vector. The cDNA clone was fully sequenced.
Nucleic acid (SEQ.ID.NO.: 65) and amino acid (SEQ.ID.NO.: 66) sequences for human AT1
were thereafter determined and verified.
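
The restriction sites carried by cloning primers such as those above can be confirmed with a simple string search. The following is a minimal sketch; the helper name is hypothetical, the recognition sequences are the standard ones for HindIII (AAGCTT), BamHI (GGATCC) and EcoRI (GAATTC), and the primers are SEQ.ID.NO.: 63 and SEQ.ID.NO.: 64 as printed above.

```python
# Standard recognition sequences for the enzymes used in these examples.
RECOGNITION_SITES = {"HindIII": "AAGCTT", "BamHI": "GGATCC", "EcoRI": "GAATTC"}

def sites_in_primer(primer):
    """Map enzyme name to the 0-based index of its first site in the primer."""
    seq = primer.upper()
    return {name: seq.find(site)
            for name, site in RECOGNITION_SITES.items() if site in seq}

AT1_5_PRIMER = "CCCAAGCTTCGCCAGGTGTATTTGAT"  # SEQ.ID.NO.: 63
AT1_3_PRIMER = "GTTGGATCCACATAATGCATTTTCTC"  # SEQ.ID.NO.: 64

print(sites_in_primer(AT1_5_PRIMER))
print(sites_in_primer(AT1_3_PRIMER))
```

In both primers the site sits three bases from the 5' end, leaving a short overhang of extra bases ahead of the recognition sequence.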

4. GPR38

To obtain GPR38, PCR was performed by combining two PCR fragments, using
human genomic cDNA as template and rTth polymerase (Perkin Elmer) with the buffer system
provided by the manufacturer, 0.25 µM of each primer, and 0.2 mM of each 4 nucleotides.
The cycle condition for each PCR reaction was 30 cycles of 94°C for 1 min, 62°C for 1 min
and 72°C for 2 min.

The first fragment was amplified with the 5' PCR primer that contained an end site
with the following sequence:
5'-ACCATGGGCAGCCCCTGGAACGGCAGC-3' (SEQ.ID.NO.: 67)
and a 3' primer having the following sequence:
5'-AGAACCACCACCAGGAGGACGCGGACGGTCTGCCGGTGG-3' (SEQ.ID.NO.: 68).
The second PCR fragment was amplified with a 5' primer having the following sequence:
5'-GTCCGCGTCCTGCTGGTGGTGGTTCTGGCATTTATAATT-3' (SEQ.ID.NO.: 69)
and a 3' primer that contained a BamHI site and having the following sequence:
5'-GCIGGATGCTTATCCCATCGTCTTCACGTrAGCT-3' (SEQ.ID.NO.: 70).
The two fragments were used as templates to amplify GPR38, using SEQ.ID.NO.: 67 and
SEQ.ID.NO.: 70 as primers (using the above-noted cycle conditions). The resulting 1.44 kb
PCR fragment was digested with BamHI and cloned into Blunt-BamHI site of pCMV
expression vector.

5. MC4

To obtain MC4, PCR was performed using human genomic cDNA as template and
rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM
of each primer, and 0.2 mM of each 4 nucleotides. The cycle condition for each PCR reaction
was 30 cycles of 94°C for 1 min, 54°C for 1 min and 72°C for 1.5 min.
The 5' PCR primer contained an EcoRI site with the sequence:
5'-CTGGAATTCTCCTGCCAGCATGGTGA-3' (SEQ.ID.NO.: 71)
and the 3' primer contained a BamHI site with the sequence:
5'-GCAGGATCCTATATTGCGTGCTCTGTCCCC-3' (SEQ.ID.NO.: 72).
The 1.0 kb PCR fragment was digested with EcoRI and BamHI and cloned into EcoRI-BamHI
site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 73) and amino acid
(SEQ.ID.NO.: 74) sequences for human MC4 were thereafter determined.

6. CCKB

To obtain CCKB, PCR was performed using human stomach cDNA as template and
rTth polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM
of each primer, and 0.2 mM of each 4 nucleotides. The cycle condition for each PCR reaction
was 30 cycles of 94°C for 1 min, 65°C for 1 min and 72°C for 1 min and 30 sec.
The 5' PCR primer contained a HindIII site with the sequence:
5'-CCGAAGCTTCGAGCTGAGTAAGGCGGCGGGCT-3' (SEQ.ID.NO.: 75)
and the 3' primer contained an EcoRI site with the sequence:
5'-GTGGAATTCATTTGCCCTGCCTCAACCCCCA-3' (SEQ.ID.NO.: 76).
The resulting 1.44 kb PCR fragment was digested with HindIII and EcoRI and cloned into
HindIII-EcoRI site of pCMV expression vector. Nucleic acid (SEQ.ID.NO.: 77) and amino
acid (SEQ.ID.NO.: 78) sequences for human CCKB were thereafter determined.

7. TDAG8

To obtain TDAG8, PCR was performed using genomic DNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of
each primer, and 0.2 mM of each 4 nucleotides. The cycle condition was 30 cycles of 94°C
for 1 min, 56°C for 1 min and 72°C for 1 min and 20 sec. The 5' PCR primer contained a
HindIII site with the following sequence:
5'-TGCAAGCTTAAAAAGGAAAAAATGAACAGC-3' (SEQ.ID.NO.: 79)
and the 3' primer contained a BamHI site with the following sequence:
5'-TAAGGATCCCTTCCCTTCAAAACATCCTTG-3' (SEQ.ID.NO.: 80).
The resulting 1.1 kb PCR fragment was digested with HindIII and BamHI and cloned into
HindIII-BamHI site of pCMV expression vector. Three resulting clones sequenced contained
three potential polymorphisms involving changes of amino acid 43 from Pro to Ala, amino
acid 97 from Lys to Asn and amino acid 130 from Ile to Phe. Nucleic acid (SEQ.ID.NO.: 81)
and amino acid (SEQ.ID.NO.: 82) sequences for human TDAG8 were thereafter determined.

8. H9

To obtain H9, PCR was performed using pituitary cDNA as template and rTth
polymerase (Perkin Elmer) with the buffer system provided by the manufacturer, 0.25 µM of
each primer, and 0.2 mM of each 4 nucleotides. The cycle condition was 30 cycles of 94°C
for 1 min, 62°C for 1 min and 72°C for 2 min. The 5' PCR primer contained a HindIII site
with the following sequence:
5'-GGAAAGCTTAACGATCCCCAGGAGCAACAT-3' (SEQ.ID.NO.: 15)
and the 3' primer contained a BamHI site with the following sequence:
5'-CTGGGATCCTACGAGAGCATTTTTCACACAG-3' (SEQ.ID.NO.: 16).
The resulting 1.9 kb PCR fragment was digested with HindIII and BamHI and cloned into
HindIII-BamHI site of pCMV expression vector. H9 contained three potential polymorphisms
involving changes of amino acid P320S, S493N and amino acid G448A. Nucleic acid
(SEQ.ID.NO.: 139) and amino acid (SEQ.ID.NO.: 140) sequences for human H9 were
thereafter determined and verified.
Example 2

Preparation of Non-Endogenous, Constitutively Activated GPCRs

Those skilled in the art are credited with the ability to select techniques for
mutation of a nucleic acid sequence. Presented below are approaches utilized to create
non-endogenous versions of several of the human GPCRs disclosed above. The mutations
disclosed below are based upon an algorithmic approach whereby the 16th amino acid
(located in the IC3 region of the GPCR) from a conserved proline residue (located in the
TM6 region of the GPCR, near the TM6/IC3 interface) is mutated, most preferably to a
lysine amino acid residue.
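
The positional rule just described can be sketched as follows. This is a minimal sketch under stated assumptions: it counts 16 residues from the conserved proline toward the N-terminus (an inference from the statement that the target lies in IC3, which precedes TM6 in the primary sequence), it takes the position of the conserved proline as an input rather than finding it, and the function name and the toy sequence in the usage note are hypothetical.

```python
def proline_offset_mutation(protein, proline_pos, offset=16, new_residue="K"):
    """Mutate the residue `offset` positions N-terminal of a conserved proline.

    protein:     amino acid sequence, one-letter code.
    proline_pos: 1-based position of the conserved TM6 proline (supplied by the user).
    Returns (label, mutated_sequence), with the label in standard form, e.g. "V5K".
    """
    if not 1 <= proline_pos <= len(protein):
        raise ValueError("proline_pos is outside the sequence")
    if protein[proline_pos - 1] != "P":
        raise ValueError("residue at proline_pos is not proline")
    target = proline_pos - offset  # assumed direction: toward the N-terminus
    if target < 1:
        raise ValueError("offset runs past the N-terminus")
    old_residue = protein[target - 1]
    mutated = protein[:target - 1] + new_residue + protein[target:]
    return f"{old_residue}{target}{new_residue}", mutated
```

For example, in a toy sequence with valine at position 5 and the proline at position 21, `proline_offset_mutation(seq, 21)` returns the label "V5K" together with the mutated sequence.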

1. Transformer Site-Directed™ Mutagenesis

Preparation of non-endogenous human GPCRs may be accomplished on human
GPCRs using Transformer Site-Directed™ Mutagenesis Kit (Clontech) according to the
manufacturer's instructions. Two mutagenesis primers are utilized, most preferably a lysine
mutagenesis oligonucleotide that creates the lysine mutation, and a selection marker
oligonucleotide. For convenience, the codon mutation to be incorporated into the human
GPCR is also noted, in standard form (Table E):

TABLE E

Receptor Identifier   Codon Mutation

hARE-3                F313K
hARE-4                V233K
hARE-5                A240K
hGPCR14               L257K
hGPCR27               C283K
hARE-1                E232K
hARE-2                G285K
hPPR1                 L239K
hG2A                  K232A
hRUP3                 L224K
hRUP5                 A236K
hRUP6                 N267K
hRUP7                 A302K
hCHN4                 V236K
hMC4                  A244K
hCHN3                 S284K
hCHN6                 L352K
hCHN8                 N235K
hCHN9                 G223K
hCHN10                L231K
hH9                   F236K


The following GPCRs were mutated in accordance with the above method using the
designated sequence primers (Table F).

TABLE F

Receptor     Codon       Lysine Mutagenesis (SEQ.ID.NO.)                  Selection Marker (SEQ.ID.NO.)
Identifier   Mutation    5'-3' orientation, mutation sequence underlined  5'-3' orientation

hRUP4        V272K       CAGGAAGAAGAAACGAGCTGTCATTATGATGGTGACAGTG (83)    CACTGTCACCATCATAATGACAGCTCGTTTCTTCTTCCTG (84)
hAT1         see below   alternative approach; see below                  alternative approach; see below
hGPR38       V297K       GGCCACCGGCAGACCAAACGCGTCCTGCTG (85)              CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT (86)
hCCKB        V332K       alternative approach; see below                  alternative approach; see below
hTDAG8       I225K       GGAAAAGAAGAGAATCAAAAAACTACTTGTCAGCATC (87)       CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT (88)
hH9          F236K       GCTGAGGTTCGCAATAAACTAACCATGTTTGTG (143)          CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT (144)
hMC4         A244K       GCCAATATGAAGGGAAAAATTACCTTGACCATC (137)          CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT (138)


The non-endogenous human GPCRs were then sequenced and the derived and 
verified nucleic acid and amino acid sequences are listed in the accompanying "Sequence 
Listing" appendix to this patent document, as summarized in Table G below: 

TABLE G

Non-Endogenous Human GPCR   Nucleic Acid Sequence Listing        Amino Acid Sequence Listing

hRUP4 (V272K)               SEQ.ID.NO.: 127                      SEQ.ID.NO.: 128
hAT1                        (see alternative approaches below)   (see alternative approaches below)
hGPR38 (V297K)              SEQ.ID.NO.: 129                      SEQ.ID.NO.: 130
hCCKB (V332K)               SEQ.ID.NO.: 131                      SEQ.ID.NO.: 132
hTDAG8 (I225K)              SEQ.ID.NO.: 133                      SEQ.ID.NO.: 134
hH9 (F236K)                 SEQ.ID.NO.: 141                      SEQ.ID.NO.: 142
hMC4 (A244K)                SEQ.ID.NO.: 135                      SEQ.ID.NO.: 136

2. Alternative Approaches For Creation of
Non-Endogenous Human GPCRs

a. AT1

1. F239K Mutation

Preparation of a non-endogenous, constitutively activated human AT1 receptor was
accomplished by creating an F239K mutation (see, SEQ.ID.NO.: 89 for nucleic acid sequence,
and SEQ.ID.NO.: 90 for amino acid sequence). Mutagenesis was performed using the
Transformer Site-Directed Mutagenesis™ Kit (Clontech) according to the manufacturer's
instructions. Two mutagenesis primers were used, a lysine mutagenesis oligonucleotide
(SEQ.ID.NO.: 91) and a selection marker oligonucleotide (SEQ.ID.NO.: 92), which had the
following sequences:

5'-CCAAGAAATGATGATATTAAAAAGATAATTATGGC-3' (SEQ.ID.NO.: 91)
5'-CTCCTTCGGTCCTCCTATCGTTGTCAGAAGT-3' (SEQ.ID.NO.: 92),
respectively.

2. N111A Mutation

Preparation of a non-endogenous human AT1 receptor was also accomplished by
creating an N111A mutation (see, SEQ.ID.NO.: 93 for nucleic acid sequence, and
SEQ.ID.NO.: 94 for amino acid sequence). Two PCR reactions were performed using pfu
polymerase (Stratagene) with the buffer system provided by the manufacturer,
supplemented with 10% DMSO, 0.25 µM of each primer, and 0.5 mM of each of the 4
nucleotides. The 5' PCR sense primer used had the following sequence:
5'-CCCAAGCTTCCCCAGGTGTATTTGAT-3' (SEQ.ID.NO.: 95)
and the antisense primer had the following sequence:
5'-CCTGCAGGCGAAACTGACTCTGGCTGAAG-3' (SEQ.ID.NO.: 96).
The resulting 400 bp PCR fragment was digested with HindIII and subcloned into the
HindIII-SmaI site of pCMV vector (5' construct). The 3' PCR sense primer used had the
following sequence:
5'-CTGTACGCTAGTGTGTTTCTACTCACGTGTCTCAGCATTGAT-3' (SEQ.ID.NO.: 97)
and the antisense primer had the following sequence:
5'-GTTGGATCCACATAATGCATTTTCTC-3' (SEQ.ID.NO.: 98).
The resulting 880 bp PCR fragment was digested with BamHI and inserted into the Pst
(blunted by T4 polymerase) and BamHI sites of the 5' construct to generate the full-length
N111A construct. The cycle condition was 25 cycles of 94°C for 1 min, 60°C for 1 min
and 72°C for 1 min (5' PCR) or 1.5 min (3' PCR).

3. AT2K255IC3 Mutation

Preparation of a non-endogenous, constitutively activated human AT1 was
accomplished by creating an AT2K255IC3 "domain swap" mutation (see, SEQ.ID.NO.: 99
for nucleic acid sequence, and SEQ.ID.NO.: 100 for amino acid sequence). Restriction
sites flanking IC3 of AT1 were generated to facilitate replacement of the IC3 with the
corresponding IC3 from angiotensin II type 2 receptor (AT2). This was accomplished by
performing two PCR reactions. A 5' PCR fragment (Fragment A) encoding from the 5'
untranslated region to the beginning of IC3 was generated by utilizing SEQ.ID.NO.: 63 as
sense primer and the following sequence:
5'-TCCGAATTCCAAAATAACTTGTAAGAATGATCAGAAA-3' (SEQ.ID.NO.: 101)
as antisense primer. A 3' PCR fragment (Fragment B) encoding from the end of IC3 to the
3' untranslated region was generated by using the following sequence:
5'-AGATCTTAAGAAGATAATTATGGCAATTGTGCT-3' (SEQ.ID.NO.: 102)
as sense primer and SEQ.ID.NO.: 64 as antisense primer. The PCR condition was 30
cycles of 94°C for 1 min, 55°C for 1 min and 72°C for 1.5 min using endogenous AT1
cDNA clone as template and pfu polymerase (Stratagene), with the buffer systems
provided by the manufacturer, supplemented with 10% DMSO, 0.25 µM of each primer,
and 0.5 mM of each of the 4 nucleotides. Fragment A (720 bp) was digested with HindIII and
EcoRI and subcloned. Fragment B was digested with BamHI and subcloned into pCMV
vector with an EcoRI site 5' to the cloned PCR fragment.

The DNA fragment (Fragment C) encoding IC3 of AT2 with a L255K mutation
and containing an EcoRI cohesive end at 5' and an AflII cohesive end at 3' was generated
by annealing 2 synthetic oligonucleotides having the following sequences:

5'-AATTCGAAAACACTTACTGAAGACGAATAGCTATGGGAAGAACAGGATAACCGGTGACCAAG-3'
(sense; SEQ.ID.NO.: 103)
5'-TTAACTTGGTCACGGGTTATCCTGTTCTTCCCATAGCTATTCGTCTTCAGTAAGTGTTTTCG-3'
(antisense; SEQ.ID.NO.: 104).

Fragment C was inserted in front of Fragment B through the EcoRI and AflII sites. The
resulting clone was then ligated with Fragment A through the EcoRI site to generate AT1
with AT2K255IC3.

4. A243+ Mutation

Preparation of a non-endogenous human AT1 receptor was also accomplished by
creating an A243+ mutation (see, SEQ.ID.NO.: 105 for nucleic acid sequence, and
SEQ.ID.NO.: 106 for amino acid sequence). An A243+ mutation was constructed using the
following PCR-based strategy: two PCR reactions were performed using pfu polymerase
(Stratagene) with the buffer system provided by the manufacturer supplemented with 10%
DMSO, 0.25 µM of each primer, and 0.5 mM of each of the 4 nucleotides. The 5' PCR sense
primer utilized had the following sequence:
5'-CCCAAGCTTCCCCAGGTGTATTTGAT-3' (SEQ.ID.NO.: 107)
and the antisense primer had the following sequence:
5'-AAGCACAATTGCTGCATAATTATCTTAAAAATATCATC-3' (SEQ.ID.NO.: 108).
The 3' PCR sense primer utilized had the following sequence:
5'-AAGATAATTATGGCAGCAATTGTGCTTTTCTTTTTCTTT-3' (SEQ.ID.NO.: 109)
containing the Ala insertion and antisense primer:
5'-GTTGGATCCACATAATGCATTTTCTC-3' (SEQ.ID.NO.: 110).
The cycle condition was 25 cycles of 94°C for 1 min, 54°C for 1 min and 72°C for 1.5 min.
An aliquot of each of the 5' and 3' PCR products was then used as co-template to perform
secondary PCR using the 5' PCR sense primer and 3' PCR antisense primer. The PCR
condition was the same as primary PCR except the extension time was 2.5 min. The
resulting PCR fragment was digested with HindIII and BamHI and subcloned into pCMV
vector. (See, SEQ.ID.NO.: 105)

b. CCKB

Preparation of the non-endogenous, constitutively activated human CCKB receptor
was accomplished by creating a V332K mutation (see, SEQ.ID.NO.: 111 for nucleic acid
sequence and SEQ.ID.NO.: 112 for amino acid sequence). Mutagenesis was performed by
PCR via amplification using the wildtype CCKB from Example 1.

The first PCR fragment (1kb) was amplified by using SEQ.ID.NO.: 75 and an
antisense primer comprising a V332K mutation:
5'-CAGCAGCATGCGCTTCACGCGCTTCTTAGCCCAG-3' (SEQ.ID.NO.: 113).
The second PCR fragment (0.44kb) was amplified by using a sense primer comprising the
V332K mutation:
5'-AGAAGCGCGTGAAGCGCATGCTGCTGGTGATCGTT-3' (SEQ.ID.NO.: 114)
and SEQ.ID.NO.: 76.

The two resulting PCR fragments were then used as template for amplifying CCKB
comprising V332K, using SEQ.ID.NO.: 75 and SEQ.ID.NO.: 76 and the above-noted
system and conditions. The resulting 1.44kb PCR fragment containing the V332K
mutation was digested with HindIII and EcoRI and cloned into the HindIII-EcoRI site of
pCMV expression vector. (See, SEQ.ID.NO.: 111).
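
The two-fragment strategy described for CCKB relies on the two mutagenic primers sharing a complementary stretch, so that the first and second PCR fragments can anneal to one another and serve together as template in the final amplification. The following is a minimal sketch of that check, using only the two primer sequences quoted above (SEQ.ID.NO.: 113 and SEQ.ID.NO.: 114); the function and variable names are illustrative assumptions and are not part of the described method.

```python
# Sketch: find the stretch shared by the two mutagenic CCKB primers.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_overlap(left: str, right: str) -> str:
    """Longest suffix of `left` that is also a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return right[:size]
    return ""


# Primer sequences as printed in the text (5'-3').
antisense_113 = "CAGCAGCATGCGCTTCACGCGCTTCTTAGCCCAG"
sense_114 = "AGAAGCGCGTGAAGCGCATGCTGCTGGTGATCGTT"

# Reverse-complementing the antisense primer gives the sense-strand
# sequence at the 3' end of the first fragment; the second fragment
# starts with the sense primer.
overlap = longest_overlap(reverse_complement(antisense_113), sense_114)
print(len(overlap), overlap)  # 26 AGAAGCGCGTGAAGCGCATGCTGCTG
```

With the sequences as printed, the shared stretch is 26 nucleotides, which is the region over which the 1kb and 0.44kb fragments would be expected to anneal.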

3. QuikChange™ Site-Directed™ Mutagenesis

Preparation of non-endogenous human GPCRs can also be accomplished by using
QuikChange™ Site-Directed™ Mutagenesis Kit (Stratagene, according to manufacturer's
instructions). Endogenous GPCR is preferably used as a template and two mutagenesis
primers utilized, as well as, most preferably, a lysine mutagenesis oligonucleotide and a
selection marker oligonucleotide (included in kit). For convenience, the codon mutation
incorporated into the human GPCR and the respective oligonucleotides are noted, in standard
form (Table H):

TABLE H

Receptor    Codon     Lysine Mutagenesis (SEQ.ID.NO.)           Selection Marker (SEQ.ID.NO.)
Identifier  Mutation  5'-3' orientation, mutation underlined    5'-3' orientation

hCHN3       S284K     ATGGAGAAAAGAATCAAAAGAATGTTCTATATA (115)   TATATAGAACATTCTTTTGATTCTTTTCTCCAT (116)
hCHN6       L352K     CGCTCTCTGGCCTTGAAGCGCACGCTCAGC (117)      GCTGAGCGTGCGCTTCAAGGCGAGAGAGCG (118)
hCHN8       N235K     CCCAGGAAAAAGGTGAAAGTCAAAGTTTTC (119)      GAAAACTTTGACTTTCACCTTTTTCCTGGG (120)
hCHN9       G223K     GGGGCGCGGGTGAAACGGCTGGTGAGC (121)         GCTCACCAGCCGTTTCACCCGCGCCCC (122)
hCHN10      L231K     CCCCTTGAAAAGCCTAAGAACTTGGTCATC (123)      GATGACCAAGTTCTTAGGCTTTTCAAGGGG (124)
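
In Table H, the second oligonucleotide of each pair appears to be the reverse complement of the first, i.e., the two primers anneal to the same site on opposite strands of the template. The following minimal sketch checks this for three of the listed pairs, using the sequences as printed with the spacing removed; the helper names are illustrative assumptions, and the check itself is offered only as a convenience for confirming transcribed primer sequences.

```python
# Sketch: confirm that paired Table H oligonucleotides are reverse complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# (SEQ.ID.NO., first oligonucleotide, SEQ.ID.NO., second oligonucleotide)
PAIRS = [
    (115, "ATGGAGAAAAGAATCAAAAGAATGTTCTATATA",
     116, "TATATAGAACATTCTTTTGATTCTTTTCTCCAT"),
    (121, "GGGGCGCGGGTGAAACGGCTGGTGAGC",
     122, "GCTCACCAGCCGTTTCACCCGCGCCCC"),
    (123, "CCCCTTGAAAAGCCTAAGAACTTGGTCATC",
     124, "GATGACCAAGTTCTTAGGCTTTTCAAGGGG"),
]


def check_pairs(pairs):
    """Map each (id, id) pair to True when the oligos are reverse complements."""
    return {
        (first_id, second_id): reverse_complement(first) == second
        for first_id, first, second_id, second in pairs
    }


results = check_pairs(PAIRS)
for ids, ok in results.items():
    print(ids, "complementary" if ok else "mismatch")
```

A pair reported as a mismatch would indicate either a transcription error in one of the two sequences or a primer pair that was not designed to be fully complementary.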



Example 3 

Receptor Expression 

Although a variety of cells are available to the art for the expression of proteins, it is
most preferred that mammalian cells be utilized. The primary reason for this is predicated
upon practicalities, i.e., utilization of, e.g., yeast cells for the expression of a GPCR, while
possible, introduces into the protocol a non-mammalian cell which may not (indeed, in the
case of yeast, does not) include the receptor-coupling, genetic-mechanism and secretory
pathways that have evolved for mammalian systems; thus, results obtained in non-
mammalian cells, while of potential use, are not as preferred as those obtained from mammalian
cells. Of the mammalian cells, COS-7, 293 and 293T cells are particularly preferred, although
the specific mammalian cell utilized can be predicated upon the particular needs of the artisan.

On day one, 1x10^7 293T cells per 150mm plate were plated out. On day two, two
reaction tubes were prepared (the proportions to follow for each tube are per plate): tube A
was prepared by mixing 20µg DNA (e.g., pCMV vector; pCMV vector with receptor
cDNA, etc.) in 1.2ml serum free DMEM (Irvine Scientific, Irvine, CA); tube B was
prepared by mixing 120µl lipofectamine (Gibco BRL) in 1.2ml serum free DMEM. Tubes
A and B were admixed by inversions (several times), followed by incubation at room
temperature for 30-45min. The admixture is referred to as the "transfection mixture".
Plated 293T cells were washed with 1XPBS, followed by addition of 10ml serum free
DMEM. 2.4ml of the transfection mixture were added to the cells, followed by incubation
for 4hrs at 37°C/5% CO2. The transfection mixture was removed by aspiration, followed
by the addition of 25ml of DMEM/10% Fetal Bovine Serum. Cells were incubated at
37°C/5% CO2. After 72hr incubation, cells were harvested and utilized for analysis.
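
Because the tube A and tube B proportions above are stated per plate, preparing several plates at once is a matter of simple scaling. The following minimal sketch multiplies the stated per-plate amounts by a plate count; the class and function names are illustrative assumptions, and no overage for pipetting loss is included.

```python
# Sketch: scale the per-plate transfection amounts to several 150mm plates.
from dataclasses import dataclass


@dataclass(frozen=True)
class TransfectionMix:
    dna_ug: float            # DNA in tube A
    dmem_tube_a_ml: float    # serum free DMEM in tube A
    lipofectamine_ul: float  # lipofectamine in tube B
    dmem_tube_b_ml: float    # serum free DMEM in tube B

    @property
    def mixture_ml(self) -> float:
        # The text adds 2.4ml per plate, i.e. the two 1.2ml DMEM volumes;
        # the 120µl of lipofectamine is not counted in that figure.
        return self.dmem_tube_a_ml + self.dmem_tube_b_ml


# Per-plate amounts as stated in the text.
PER_PLATE = TransfectionMix(
    dna_ug=20.0, dmem_tube_a_ml=1.2, lipofectamine_ul=120.0, dmem_tube_b_ml=1.2
)


def scale(n_plates: int, per_plate: TransfectionMix = PER_PLATE) -> TransfectionMix:
    """Return the total amounts needed for `n_plates` plates."""
    if n_plates < 1:
        raise ValueError("n_plates must be at least 1")
    return TransfectionMix(
        dna_ug=per_plate.dna_ug * n_plates,
        dmem_tube_a_ml=per_plate.dmem_tube_a_ml * n_plates,
        lipofectamine_ul=per_plate.lipofectamine_ul * n_plates,
        dmem_tube_b_ml=per_plate.dmem_tube_b_ml * n_plates,
    )


five_plates = scale(5)
print(five_plates, five_plates.mixture_ml)
```

For five plates this gives 100µg DNA, 600µl lipofectamine and 6ml of serum free DMEM in each tube.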
Example 4 

Assays For Determination of Constitutive Activity
of Non-Endogenous GPCRs

A variety of approaches are available for assessment of constitutive activity of the
non-endogenous human GPCRs. The following are illustrative; those of ordinary skill in
the art are credited with the ability to determine those techniques that are preferentially
beneficial for the needs of the artisan.

1. Membrane Binding Assays: [35S]GTPγS Assay

When a G protein-coupled receptor is in its active state, either as a result of ligand
binding or constitutive activation, the receptor couples to a G protein and stimulates the
release of GDP and subsequent binding of GTP to the G protein. The alpha subunit of the G
protein-receptor complex acts as a GTPase and slowly hydrolyzes the GTP to GDP, at which
point the receptor normally is deactivated. Constitutively activated receptors continue to
exchange GDP for GTP. The non-hydrolyzable GTP analog, [35S]GTPγS, can be utilized to
demonstrate enhanced binding of [35S]GTPγS to membranes expressing constitutively
activated receptors. The advantage of using [35S]GTPγS binding to measure constitutive
activation is that: (a) it is generically applicable to all G protein-coupled receptors; (b) it is
proximal at the membrane surface making it less likely to pick-up molecules which affect the
intracellular cascade.

The assay utilizes the ability of G protein coupled receptors to stimulate [35S]GTPγS
binding to membranes expressing the relevant receptors. The assay can, therefore, be used in
the direct identification method to screen candidate compounds to known, orphan and
constitutively activated G protein-coupled receptors. The assay is generic and has application
to drug discovery at all G protein-coupled receptors.

The [35S]GTPγS assay can be incubated in 20 mM HEPES and between 1 and about
20mM MgCl2 (this amount can be adjusted for optimization of results, although 20mM is
preferred) pH 7.4, binding buffer with between about 0.3 and about 1.2 nM [35S]GTPγS (this
amount can be adjusted for optimization of results, although 1.2 is preferred) and 12.5 to 75
µg membrane protein (e.g., COS-7 cells expressing the receptor; this amount can be adjusted
for optimization, although 15µg is preferred) and 1 µM GDP (this amount can be changed for
optimization) for 1 hour. Wheatgerm agglutinin beads (25 µl; Amersham) should then be
added and the mixture incubated for another 30 minutes at room temperature. The tubes are
then centrifuged at 1500 x g for 5 minutes at room temperature and then counted in a
scintillation counter.

A less costly but equally applicable alternative has been identified which also meets
the needs of large scale screening. Flash plates™ and Wallac™ scintistrips may be utilized
to format a high throughput [35S]GTPγS binding assay. Furthermore, using this technique,
the assay can be utilized for known GPCRs to monitor tritiated ligand binding
to the receptor at the same time as monitoring the efficacy via [35S]GTPγS binding. This is
possible because the Wallac beta counter can switch energy windows to look at both tritium
and 35S-labeled probes. This assay may also be used to detect other types of membrane
activation events resulting in receptor activation. For example, the assay may be used to
monitor 32P phosphorylation of a variety of receptors (both G protein coupled and tyrosine
kinase receptors). When the membranes are centrifuged to the bottom of the well, the bound
[35S]GTPγS or the 32P-phosphorylated receptor will activate the scintillant which is coated on
the wells. Scinti® strips (Wallac) have been used to demonstrate this principle. In addition, the
assay also has utility for measuring ligand binding to receptors using radioactively labeled
ligands. In a similar manner, when the radiolabeled bound ligand is centrifuged to the bottom
of the well, the scintistrip label comes into proximity with the radiolabeled ligand resulting
in activation and detection.
2. Adenylyl Cyclase 
A Flash Plate™ Adenylyl Cyclase kit (New England Nuclear; Cat. No. SMP004A)
designed for cell-based assays can be modified for use with crude plasma membranes. The
Flash Plate wells contain a scintillant coating which also contains a specific antibody
recognizing cAMP. The cAMP generated in the wells was quantitated by a direct
competition for binding of radioactive cAMP tracer to the cAMP antibody. The following
serves as a brief protocol for the measurement of changes in cAMP levels in membranes that
express the receptors.

Transfected cells are harvested approximately three days after transfection.
Membranes are prepared by homogenization of suspended cells in buffer containing 20mM
HEPES, pH 7.4 and 10mM MgCl2. Homogenization is performed on ice using a Brinkman
Polytron™ for approximately 10 seconds. The resulting homogenate is centrifuged at 49,000
X g for 15 minutes at 4°C. The resulting pellet is then resuspended in buffer containing
20mM HEPES, pH 7.4 and 0.1 mM EDTA, homogenized for 10 seconds, followed by
centrifugation at 49,000 X g for 15 minutes at 4°C. The resulting pellet can be stored at
-80°C until utilized. On the day of measurement, the membrane pellet is slowly thawed at
room temperature, resuspended in buffer containing 20mM HEPES, pH 7.4 and 10mM
MgCl2 (these amounts can be optimized, although the values listed herein are preferred), to
yield a final protein concentration of 0.60mg/ml (the resuspended membranes are placed
on ice until use).

cAMP standards and Detection Buffer (comprising 2 µCi of tracer [125I]cAMP (100
µl) to 11 ml Detection Buffer) are prepared and maintained in accordance with the
manufacturer's instructions. Assay Buffer is prepared fresh for screening and contains
20mM HEPES, pH 7.4, 10mM MgCl2, 20mM (Sigma), 0.1 units/ml creatine phosphokinase
(Sigma), 50 µM GTP (Sigma), and 0.2 mM ATP (Sigma); Assay Buffer can be stored on ice
until utilized. The assay is initiated by addition of 50µl of assay buffer followed by addition
of 50µl of membrane suspension to the NEN Flash Plate. The resultant assay mixture is
incubated for 60 minutes at room temperature followed by addition of 100µl of detection
buffer. Plates are then incubated an additional 2-4 hours followed by counting in a Wallac
MicroBeta™ scintillation counter. Values of cAMP/well are extrapolated from a standard
cAMP curve that is contained within each assay plate.
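
Reading cAMP/well from the standard curve on each plate can be sketched as follows. This is a minimal illustration only: it assumes straight-line interpolation between neighboring standards, whereas the counter software or a fitted curve may be used in practice, and the standard values shown are invented placeholders rather than measured data. Because the assay is a competition for antibody binding, counts fall as cAMP rises.

```python
# Sketch: read pmol cAMP/well off a plate's standard curve by linear
# interpolation between neighboring standards (an assumed approach).

# HYPOTHETICAL standards, (pmol cAMP per well, counts). Placeholder
# numbers for illustration only; real values come from each assay plate.
STANDARDS = [(0.0, 10000.0), (1.0, 8000.0), (5.0, 5000.0),
             (10.0, 3000.0), (50.0, 1000.0)]


def camp_from_counts(counts: float, standards=STANDARDS) -> float:
    """Interpolate pmol cAMP/well for a measured count.

    Counts outside the range spanned by the standards are rejected
    rather than extrapolated (a design choice of this sketch).
    """
    points = sorted(standards, key=lambda p: p[1])  # ascending counts
    if not points[0][1] <= counts <= points[-1][1]:
        raise ValueError("count is outside the range of the standard curve")
    for (pmol_lo, counts_lo), (pmol_hi, counts_hi) in zip(points, points[1:]):
        if counts_lo <= counts <= counts_hi:
            fraction = (counts - counts_lo) / (counts_hi - counts_lo)
            return pmol_lo + fraction * (pmol_hi - pmol_lo)
    raise ValueError("no bracketing standards found")


print(camp_from_counts(4000.0))  # 7.5 with the placeholder standards
```

Rejecting out-of-range counts keeps the sketch from reporting values beyond the highest or lowest standard, where a straight-line estimate would be least reliable.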
C. Reporter-Based Assays

1. CREB Reporter Assay (Gs-associated receptors)

A method to detect Gs stimulation depends on the known property of the transcription
factor CREB, which is activated in a cAMP-dependent manner. A PathDetect™ CREB trans-
Reporting System (Stratagene, Catalogue # 219010) can be utilized to assay for Gs coupled
activity in 293 or 293T cells. Cells are transfected with the plasmid components of the
above system and the indicated expression plasmid encoding endogenous or mutant receptor
using a Mammalian Transfection™ Kit according to the manufacturer's instructions. Briefly,
pFR-Luc (luciferase reporter plasmid containing
Gal4 recognition sequences), 40 ng pFA2-CREB (Gal4-CREB fusion protein containing the
Gal4 DNA-binding domain), 80 ng pCMV-receptor expression plasmid (comprising the
receptor) and 20 ng CMV-SEAP (secreted alkaline phosphatase expression plasmid; alkaline
phosphatase activity is measured in the media of transfected cells to control for variations in
transfection efficiency between samples) are combined in a calcium phosphate precipitate as
per the Kit's instructions. Half of the precipitate is equally distributed over 3 wells in a 96-
well plate, kept on the cells overnight, and replaced with fresh medium the following morning.
Forty-eight (48) hr after the start of the transfection, cells are treated and assayed for, e.g.,
luciferase activity.

2. AP1 reporter assay (Gq-associated receptors)

A method to detect Gq stimulation depends on the known property of Gq-dependent
phospholipase C to cause the activation of genes containing AP1 elements in their promoter.
A Pathdetect™ AP-1 cis-Reporting System (Stratagene, Catalogue # 219073) can be utilized
following the protocol set forth above with respect to the CREB reporter assay, except that
the components of the calcium phosphate precipitate were 410 ng pAP1-Luc, 80 ng pCMV-
receptor expression plasmid, and 20 ng CMV-SEAP.

3. Cre-Luc Reporter Assay 
293 and 293T cells are plated-out on 96 well plates at a density of 2 x 10^4 cells per
well and were transfected using Lipofectamine Reagent (BRL) the following day according
to manufacturer instructions. A DNA/lipid mixture is prepared for each 6-well transfection
as follows: 260ng of plasmid DNA in 100µl of DMEM were gently mixed with 2µl of lipid
in 100µl of DMEM (the 260ng of plasmid DNA consisted of 200ng of a 8xCRE-Luc reporter
plasmid (see below and Figure 1 for a representation of a portion of the plasmid), 50ng of
pCMV comprising endogenous receptor or non-endogenous receptor or pCMV alone, and
10ng of a GPRS expression plasmid (GPRS in pcDNA3 (Invitrogen))). The 8XCRE-Luc
reporter plasmid was prepared as follows: vector SRIF-β-gal was obtained by cloning the rat
somatostatin promoter (-71/+51) at BglV/HindIII site in the pβgal-Basic Vector (Clontech).
Eight (8) copies of cAMP response element were obtained by PCR from an adenovirus
template AdpCF126CCRE8 (see, 7 Human Gene Therapy 1883 (1996)) and cloned into the
SRIF-β-gal vector at the Kpn-BglV site, resulting in the 8xCRE-β-gal reporter vector. The
8xCRE-Luc reporter plasmid was generated by replacing the beta-galactosidase gene in the
8xCRE-β-gal reporter vector with the luciferase gene obtained from the pGL3-basic vector
(Promega) at the HindIII-BamHI site. Following 30 min. incubation at room temperature, the
DNA/lipid mixture was diluted with 400 µl of DMEM and 100µl of the diluted mixture was
added to each well. 100 µl of DMEM with 10% FCS were added to each well after a 4hr
incubation in a cell culture incubator. The following day the transfected cells were changed
with 200 µl/well of DMEM with 10% FCS. Eight (8) hours later, the wells were changed to
100 µl/well of DMEM without phenol red, after one wash with PBS. Luciferase activity was
measured the next day using the LucLite™ reporter gene assay kit (Packard) following
manufacturer instructions and read on a 1450 MicroBeta™ scintillation and luminescence
counter (Wallac).

4. SRF-Luc Reporter Assay

One method to detect Gq stimulation depends on the known property of Gq-dependent
phospholipase C to cause the activation of genes containing serum response factors in their
promoter. A Pathdetect™ SRE-Luc Reporting System (Stratagene) can be utilized to assay
for Gq coupled activity in, e.g., COS7 cells. Cells are transfected with the plasmid
components of the system and the indicated expression plasmid encoding endogenous or non-
endogenous GPCR using a Mammalian Transfection™ Kit (Stratagene, Catalogue #200285)
according to the manufacturer's instructions. Briefly, 410 ng SRF-Luc, 80 ng pCMV-receptor
expression plasmid and 20 ng CMV-SEAP (secreted alkaline phosphatase expression plasmid;
alkaline phosphatase activity is measured in the media of transfected cells to control for
variations in transfection efficiency between samples) are combined in a calcium phosphate
precipitate as per the manufacturer's instructions. Half of the precipitate is equally distributed
over 3 wells in a 96-well plate and kept on the cells in a serum free media for 24 hours. The last
5 hours the cells are incubated with 1µM Angiotensin, where indicated. Cells are then lysed
and assayed for luciferase activity using a Luclite™ Kit (Packard, Cat. # 6016911) and "Trilux
1450 Microbeta" liquid scintillation and luminescence counter (Wallac) as per the
manufacturer's instructions. The data can be analyzed using GraphPad Prism™ 2.0a
(GraphPad Software Inc.).

5. Intracellular IP3 Accumulation Assay

On day 1, cells comprising the receptors (endogenous and/or non-endogenous) can
be plated onto 24 well plates, usually 1x10^5 cells/well (although this number can be
optimized). On day 2 cells can be transfected by firstly mixing 0.25µg DNA in 50 µl serum
free DMEM/well and 2 µl lipofectamine in 50 µl serum free DMEM/well. The solutions
are gently mixed and incubated for 15-30 min at room temperature. Cells are washed with
0.5 ml PBS and 400 µl of serum free media is mixed with the transfection media and
added to the cells. The cells are then incubated for 3-4 hrs at 37°C/5% CO2 and then the
transfection media is removed and replaced with 1 ml/well of regular growth media. On
day 3 the cells are labeled with 3H-myo-inositol. Briefly, the media is removed and the
cells are washed with 0.5 ml PBS. Then 0.5 ml inositol-free/serum free media (GIBCO
BRL) is added/well with 0.25 µCi of 3H-myo-inositol/well and the cells are incubated for
16-18 hrs o/n at 37°C/5% CO2. On Day 4 the cells are washed with 0.5 ml PBS and 0.45
ml of assay medium is added containing inositol-free/serum free media, 10 µM pargyline,
10 mM lithium chloride, or 0.4 ml of assay medium and 50 µl of 10x ketanserin (ket) to
final concentration of 10µM. The cells are then incubated for 30 min at 37°C. The cells
are then washed with 0.5 ml PBS and 200 µl of fresh/ice cold stop solution (1 M KOH; 48
mM Na-borate; 3.8 mM EDTA) is added/well. The solution is kept on ice for 5-10 min or
until cells are lysed and then neutralized by 200 µl of fresh/ice cold neutralization sol.
(7.5 % HCL). The lysate is then transferred into 1.5 ml eppendorf tubes and 1 ml of
chloroform/methanol (1:2) is added/tube. The solution is vortexed for 15 sec and the
upper phase is applied to a Biorad AG1-X8™ anion exchange resin (100-200 mesh).
Firstly, the resin is washed with water at 1:1.25 W/V and 0.9 ml of upper phase is loaded
onto the column. The column is washed with 10 ml of 5 mM myo-inositol and 10 ml of 5
mM Na-borate/60mM Na-formate. The inositol tris phosphates are eluted into scintillation
vials containing 10 ml of scintillation cocktail with 2 ml of 0.1 M formic acid/1 M
ammonium formate. The columns are regenerated by washing with 10 ml of 0.1 M formic
acid/3M ammonium formate and rinsed twice with dd H2O and stored at 4°C in water.

Exemplary results are presented below in Table I:

TABLE I

Receptor  Mutation     Assay Utilized           Signal Generated:       Signal Generated:       Percent
                                                Endogenous Version      Non-Endogenous Version  Difference
                                                (Relative Light Units)  (Relative Light Units)

hAT1      F239K        SRF-LUC                  34                      137                     75% ↑
hAT1      AT2K255IC3   SRF-LUC                  34                      127                     73% ↑
hTDAG8    I225K        CRE-LUC (293 cells)      2,715                   14,440                  81% ↑
hTDAG8    I225K        CRE-LUC (293T cells)     65,681                  185,636                 65% ↑
hH9       F236K        CRE-LUC                  1,887                   6,096                   69% ↑
hCCKB     V332K        CRE-LUC                  785                     3,223                   76% ↑
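
Table I does not state how the Percent Difference column was calculated. The listed values are consistent with the increase in signal expressed as a fraction of the non-endogenous signal, i.e., (non-endogenous - endogenous) / non-endogenous, rounded to the nearest whole percent. The following minimal sketch applies that formula, which is an inference from the tabulated numbers and not a stated method, to each row of Table I; the function name is an illustrative assumption.

```python
# Sketch: reproduce the Percent Difference column of Table I under the
# ASSUMED formula (non-endogenous - endogenous) / non-endogenous.

def percent_difference(endogenous: float, non_endogenous: float) -> int:
    """Signal increase as a whole-number percentage of the non-endogenous signal."""
    return round(100 * (non_endogenous - endogenous) / non_endogenous)


# (receptor and mutation, endogenous RLU, non-endogenous RLU, listed percent)
TABLE_I = [
    ("hAT1 F239K", 34, 137, 75),
    ("hAT1 AT2K255IC3", 34, 127, 73),
    ("hTDAG8 I225K (293 cells)", 2715, 14440, 81),
    ("hTDAG8 I225K (293T cells)", 65681, 185636, 65),
    ("hH9 F236K", 1887, 6096, 69),
    ("hCCKB V332K", 785, 3223, 76),
]

computed = [percent_difference(endo, non_endo) for _, endo, non_endo, _ in TABLE_I]
for (label, _, _, listed), value in zip(TABLE_I, computed):
    print(f"{label}: computed {value}%, listed {listed}%")
```

Under this formula all six computed values agree with the listed percentages.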



G. Cell-Based Detection Assay (Example - TDAG8)

293 cells were plated-out on 150mm plates at a density of 13 x l^
and were transfected using 12µg of the respective DNA and 60µl of Lipofectamine Reagent
(BRL) per plate. The transfected cells were grown in media containing serum for an assay
performed 24 hours post-transfection. For detection assay performed 48 hours post-
transfection (assay comparing serum and serum-free media; see Figure 3), the initial media
was changed to either serum or serum-free media. The serum-free media was comprised solely
of Dulbecco's Modified Eagle's (DME) High Glucose Medium (Irvine Scientific #9024). In
addition to the above DME Medium, the media with serum contained the following: 10%
Fetal Bovine Serum (Hyclone #SH30071.03), 1% of 100mM Sodium Pyruvate (Irvine
Scientific #9334), 1% of 20mM L-Glutamine (Irvine Scientific #9317), and 1% of Penicillin-
Streptomycin solution (Irvine Scientific #9366).

A 96-well Adenylyl Cyclase Activation Flashplate™ was used (NEN: #SMP004A).
First, 50µl of the standards for the assay were added to the plate, in duplicate, ranging from
concentrations of 50pmol to zero pmol cAMP per well. The standard cAMP (NEN:
#SMP004A) was reconstituted in water, and serial dilutions were made using 1xPBS (Irvine
Scientific: #9240). Next, 50µl of the stimulation buffer (NEN: #SMP004A) was added to all
wells. In the case of using compounds to measure activation or inactivation of cAMP, 10µl
of each compound, diluted in water, was added to its respective well, in triplicate. Various
final concentrations used range from 1µM up to 1mM. Adenosine 5'-triphosphate, ATP,
(Research Biochemicals International: #A-141) and Adenosine 5'-diphosphate, ADP, (Sigma:
#A2754) were used in the assay. Next, the 293 cells transfected with the respective cDNA
(CMV or TDAG8) were harvested 24 (assay detection in serum media) or 48 hours post-
transfection (assay detection comparing serum and serum-free media). The media was
aspirated and the cells washed once with 1xPBS. Then 5ml of 1xPBS was added to the cells
along with 3ml of cell dissociation buffer (Sigma: #C-1544). The detached cells were
transferred to a centrifuge tube and centrifuged at room temperature for five minutes. The
supernatant was removed and the cell pellet was resuspended in an appropriate amount of
1xPBS to obtain a final concentration of 2x10^6 cells per milliliter. To the wells containing the
compound, 50µl of the cells in 1xPBS (1x10^5 cells/well) were added. The plate was incubated
on a shaker for 15 minutes at room temperature. The detection buffer containing the tracer
cAMP was prepared. In 11ml of detection buffer (NEN: #SMP004A), 50µl (equal to 1µCi)
of [125I]cAMP (NEN: #SMP004A) was added. Following incubation, 50µl of this detection
buffer containing tracer cAMP was added to each well. The plate was placed on a shaker and
incubated at room temperature for two hours. Finally, the solution from the wells of the plate
was aspirated and the flashplate was counted using the Wallac MicroBeta™ scintillation
counter.

In Figure 2A, ATP and ADP bind to endogenous TDAG8 resulting in an increase
of cAMP of about 59% and about 55% respectively. Figure 2B evidences ATP and ADP
binding to endogenous TDAG8 where endogenous TDAG8 was transfected and grown in
serum and serum-free medium. ATP binding to endogenous TDAG8 grown in serum
media evidences an increase in cAMP of about 65%, compared to the endogenous TDAG8
with no compounds; in serum-free media there was an increase of about 68%. ADP
binding to endogenous TDAG8 in serum evidences about a 61% increase, while in serum-
free media ADP binding evidences an increase of about 62%. ATP and ADP bind to
endogenous TDAG8 with an EC50 value of 139.8µM and 120.5µM, respectively (data not
shown).

Although the results presented in Figure 2B indicate substantially the same results
when serum and serum-free media were compared, our choice is to use a serum based
media, although a serum-free media can also be utilized.
Example 6 

GPCR Fusion Protein Preparation

The design of the constitutively activated GPCR-G protein fusion construct was
accomplished as follows: both the 5' and 3' ends of the rat G protein Gsα (long form; Itoh,
H. et al., 83 PNAS 3776 (1986)) were engineered to include a HindIII (5'-AAGCTT-3')
sequence thereon. Following confirmation of the correct sequence (including the flanking
HindIII sequences), the entire sequence was shuttled into pcDNA3.1(-) (Invitrogen, cat. no.
V795-20) by subcloning using the HindIII restriction site of that vector. The correct
orientation for the Gsα sequence was determined after subcloning into pcDNA3.1(-). The
modified pcDNA3.1(-) containing the rat Gsα gene at the HindIII sequence was then verified;
this vector was now available as a "universal" Gsα protein vector. The pcDNA3.1(-) vector
contains a variety of well-known restriction sites upstream of the HindIII site, thus
beneficially providing the ability to insert, upstream of the Gs protein, the coding sequence
of an endogenous, constitutively active GPCR. This same approach can be utilized to create
other "universal" G protein vectors, and, of course, other commercially available or
proprietary vectors known to the artisan can be utilized; the important criterion is that the
sequence for the GPCR be upstream and in-frame with that of the G protein.

TDAG8 couples via Gs, while H9 couples via Gz. For the following exemplary GPCR
Fusion Proteins, fusion to Gsα was accomplished.
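
The "upstream and in-frame" criterion stated above can be illustrated with a minimal sketch. The helper name, the toy GPCR fragment and the spacer strings are assumptions made for illustration only and are not taken from the constructs described herein; the sketch simply tests that the combined length of the GPCR coding sequence (stop codon removed) and the spacer is a multiple of three and that no stop codon is read before the G protein begins.

```python
# Hypothetical helper (not part of the source protocol): checks whether a G protein
# coding sequence placed after a GPCR coding sequence plus spacer stays in frame.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fusion_is_in_frame(gpcr_cds: str, spacer: str) -> bool:
    """Return True if a downstream G protein would be read in the GPCR frame.

    gpcr_cds: GPCR coding sequence starting at its start codon, stop codon removed.
    spacer:   nucleotides left between the two coding sequences
              (for example, by a restriction site).
    """
    upstream = (gpcr_cds + spacer).upper()
    # The first G protein codon must begin on a codon boundary of the GPCR frame.
    if len(upstream) % 3 != 0:
        return False
    # No stop codon may be read in frame before the G protein begins.
    codons = (upstream[i:i + 3] for i in range(0, len(upstream), 3))
    return not any(codon in STOP_CODONS for codon in codons)


if __name__ == "__main__":
    toy_gpcr = "ATGAACAGCACA"  # toy 12-nucleotide fragment, for illustration only
    print(fusion_is_in_frame(toy_gpcr, "GGTACC"))   # 6-nucleotide spacer
    print(fusion_is_in_frame(toy_gpcr, "GGTACCC"))  # 7-nucleotide spacer shifts the frame
```

A check of this kind is only a screen; sequencing of each positive clone remains the confirmation of correctness.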

A TDAG8(I225K)-Gsα Fusion Protein construct was made as follows: primers were
designed as follows:

5'-gatcTCTAGAATGAACAGCACATGTATTGAAG-3' (SEQ.ID.NO.: 125; sense)
5'-ctagGGTACCCGCTCAAGGACCTCTAATTCCATAG-3' (SEQ.ID.NO.: 126; antisense).

Nucleotides in lower caps are included as spacers in the restriction sites between the
G protein and TDAG8. The sense and anti-sense primers included the restriction sites for
XbaI and KpnI, respectively.
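
As an illustrative check only, the recognition sequences for XbaI (TCTAGA) and KpnI (GGTACC) can be located in the two primers set forth above. The function name and dictionary below are hypothetical conveniences; the primer strings are those of SEQ.ID.NO.: 125 and SEQ.ID.NO.: 126.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"XbaI": "TCTAGA", "KpnI": "GGTACC"}

# Primers as written above (lower-case nucleotides are the spacers).
SENSE = "gatcTCTAGAATGAACAGCACATGTATTGAAG"
ANTISENSE = "ctagGGTACCCGCTCAAGGACCTCTAATTCCATAG"


def find_sites(primer: str) -> dict:
    """Map enzyme name to the 0-based position of its site, for sites present."""
    seq = primer.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


if __name__ == "__main__":
    print("sense:", find_sites(SENSE))
    print("antisense:", find_sites(ANTISENSE))
```

In both primers the site begins immediately after the four lower-case spacer nucleotides.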

PCR was then utilized to secure the respective receptor sequences for fusion within
the Gsα universal vector disclosed above, using the following protocol for each: 100ng cDNA
for TDAG8 was added to separate tubes containing 2uL of each primer (sense and anti-sense),
3uL of 10mM dNTPs, 10uL of 10X TaqPlus™ Precision buffer, 1uL of TaqPlus™ Precision
polymerase (Stratagene: #600211), and 80uL of water. Reaction temperatures and cycle times
for TDAG8 were as follows: the initial denaturing step was done at 94°C for five minutes, and
a cycle of 94°C for 30 seconds; 55°C for 30 seconds; 72°C for two minutes. A final
extension time was done at 72°C for ten minutes. PCR product was run on a 1% agarose
gel and then purified (data not shown). The purified product was digested with XbaI and
KpnI, and the desired inserts were isolated, purified and ligated into the Gs universal vector
at the respective restriction site. The positive clones were isolated following
transformation and determined by restriction enzyme digest; expression using 293 cells was
accomplished following the protocol set forth infra. Each positive clone for TDAG8:Gs
Fusion Protein was sequenced to verify correctness.

GPCR Fusion Proteins comprising non-endogenous, constitutively activated
TDAG8(I225K) were analyzed as above and verified for constitutive activation.

An H9(F236K)-Gsα Fusion Protein construct was made as follows: primers were
designed as follows:

5'-TfAgatatcGGGGGCCACCCTAGCGGT-3' (SEQ.ID.NO.: 145; sense)
5'-ggtacCCCCAGAGCCATTTCATCAGGATC-3' (SEQ.ID.NO.: 146; antisense).

Nucleotides in lower caps are included as spacers in the restriction sites between the
G protein and H9. The sense and anti-sense primers included the restriction sites for EcoRV
and KpnI, respectively, such that spacers (attributed to the restriction sites) exist between
the G protein and H9.

PCR was then utilized to secure the respective receptor sequences for fusion within
the Gsα universal vector disclosed above, using the following protocol for each: 80ng cDNA
for H9 was added to separate tubes containing 100ng of each primer (sense and anti-sense),
and 45uL of PCR Supermix™ (Gibco-Brl, LifeTech) (50uL total reaction volume). Reaction
temperatures and cycle times for H9 were as follows: the initial denaturing step was done at
94°C for one minute, and a cycle of 94°C for 30 seconds; 55°C for 30 seconds; 72°C for two
minutes. A final extension time was done at 72°C for seven minutes. PCR product was
run on a 1% agarose gel and then purified (data not shown). The purified product was cloned
into pCRII-TOPO™ System followed by identification of positive clones. Positive clones
were isolated, digested with EcoRV and KpnI (New England Biolabs) and the desired inserts
were isolated, purified and ligated into the Gs universal vector at the respective restriction site.
The positive clones were isolated following transformation and determined by restriction
enzyme digest; expression using 293 cells was accomplished following the protocol set forth
infra. Each positive clone for H9(F236K):Gs Fusion Protein was sequenced to verify
correctness. Membranes were frozen (-80°C) until utilized.

To ascertain the ability of measuring a cAMP response mediated by the Gs protein
(even though H9 couples with Gz), the following cAMP membrane assay was utilized, based
upon an NEN Adenyl Cyclase Activation Flashplate™ Assay kit (96 well format). "Binding
Buffer" consisted of 10mM HEPES, 100mM NaCl and 10mM MgCl2 (pH 7.4). "Regeneration
Buffer" was prepared in Binding Buffer and consisted of 20mM phosphocreatine, 20U
creatine phosphokinase, 20uM GTP, 0.2mM ATP, and 0.6mM IBMX. "cAMP Standards"
were prepared in Binding Buffer as follows:

Standard   cAMP Stock                Added to indicated     Final Assay Concentration
           (5,000 pmol/ml in         amount of Binding      (50ul into 100ul) to achieve
           2ml H2O), in ul           Buffer                 indicated pmol/well

A          250                       1ml                    50
B          500 of A                  500ul                  25
C          500 of B                  500ul                  12.5
D          500 of C                  750ul                  5.0
E          500 of D                  500ul                  2.5
F          500 of E                  500ul                  1.25
G          500 of F                  750ul                  0.5
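
The arithmetic behind the cAMP standard series can be sketched as follows. This is an illustration under the assumptions stated in the comments (ideal mixing, 50ul of each standard delivered per well); the variable and function names are hypothetical and do not appear in the kit instructions.

```python
# Stock concentration of cAMP and the volume of standard delivered per well,
# both taken from the table above.
STOCK_PMOL_PER_ML = 5000.0
STANDARD_UL_PER_WELL = 50.0

# (label, ul transferred from the previous solution, ul of Binding Buffer added).
# Standard A is made from the stock; each later standard from the one before it.
STEPS = [
    ("A", 250.0, 1000.0),
    ("B", 500.0, 500.0),
    ("C", 500.0, 500.0),
    ("D", 500.0, 750.0),
    ("E", 500.0, 500.0),
    ("F", 500.0, 500.0),
    ("G", 500.0, 750.0),
]


def pmol_per_well() -> dict:
    """Return the pmol of cAMP delivered per well for each standard."""
    conc = STOCK_PMOL_PER_ML  # pmol/ml of the solution being diluted
    result = {}
    for label, transfer_ul, buffer_ul in STEPS:
        conc = conc * transfer_ul / (transfer_ul + buffer_ul)
        result[label] = conc * STANDARD_UL_PER_WELL / 1000.0
    return result


if __name__ == "__main__":
    for label, pmol in pmol_per_well().items():
        print(label, pmol)
```

Under these assumptions the computed values agree with the final column of the table (50, 25, 12.5, 5.0, 2.5, 1.25 and 0.5 pmol/well).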



Frozen membranes (both pCMV as control and the non-endogenous H9-Gs Fusion
Protein) were thawed (on ice at room temperature until in solution). Membranes were
homogenized with a polytron until in suspension (2 x 15 seconds). Membrane protein
concentration was determined using the Bradford Assay Protocol (see infra). Membrane
concentration was diluted to 0.5mg/ml in Regeneration Buffer (final assay concentration:
25ug/well). Thereafter, 50ul of Binding Buffer [...] of cAMP standard was added to wells
11 and 12 A-G, with Binding Buffer alone to 12H (on the 96-well format). Thereafter,
50ul/well of protein was added to the wells and incubated at room temperature (on shaker)
for 60min. 100ul [125I]cAMP in Detection Buffer (see infra) was added to each well
(final: 50ul [125I]cAMP into 11ml Detection Buffer). These were incubated for 2hrs at room
temperature. Plates were aspirated with an 8 channel manifold and sealed with plate covers.
Results (pmoles cAMP bound) were read in a Wallac™ 1450 on "prot #15". Results are
presented in Figure 3.

The results presented in Figure 3 indicate that the Gs coupled fusion was able to
"drive" the cyclase reaction such that measurement of the constitutive activation of
H9(F236K) was viable. Based upon these results, the direct identification of candidate
compounds that are inverse agonists, agonists and partial agonists is possible using a
cyclase-based assay.

Example 6

Protocol: Direct Identification of Inverse Agonists and Agonists Using [35S]GTPγS

Although we have utilized endogenous, constitutively active GPCRs for the direct
identification of candidate compounds as, e.g., inverse agonists, for reasons that are not
altogether understood, intra-assay variation can become exacerbated. Preferably, then, a
GPCR Fusion Protein, as disclosed above, is also utilized with a non-endogenous,
constitutively activated GPCR. We have determined that when such a protein is used,
intra-assay variation appears to be substantially stabilized, whereby an effective
signal-to-noise ratio is obtained. This has the beneficial result of allowing for a more
robust identification of candidate compounds. Thus, it is preferred that for direct
identification, a GPCR Fusion Protein be used and that when utilized, the following assay
protocols be utilized.
Membrane Preparation

Membranes comprising the non-endogenous, constitutively active orphan GPCR
Fusion Protein of interest and for use in the direct identification of candidate compounds as
inverse agonists, agonists or partial agonists are preferably prepared as follows:

a. Materials

"Membrane Scrape Buffer" is comprised of 20mM HEPES and 10mM EDTA, pH 7.4;
"Membrane Wash Buffer" is comprised of 20mM HEPES and 0.1mM EDTA, pH 7.4;
"Binding Buffer" is comprised of 20mM HEPES, 100mM NaCl, and 10mM MgCl2, pH 7.4.

b. Procedure

All materials are kept on ice throughout the procedure. Firstly, the media is aspirated
from a confluent monolayer of cells, followed by rinse with 10ml cold PBS, followed by
aspiration. Thereafter, 5ml of Membrane Scrape Buffer is added to scrape cells; this is
followed by transfer of cellular extract into 50ml centrifuge tubes (centrifuged at 20,000 rpm
for 17 minutes at 4°C). Thereafter, the supernatant is aspirated and the pellet is resuspended
in 30ml Membrane Wash Buffer followed by centrifuge at 20,000 rpm for 17 minutes at 4°C.
The supernatant is then aspirated and the pellet resuspended in Binding Buffer. This is then
homogenized using a Brinkman polytron™ homogenizer (15-20 second bursts until all the
material is in suspension). This is referred to herein as "Membrane Protein".
Bradford Protein Assay

Following the homogenization, protein concentration of the membranes is determined
using the Bradford Protein Assay (protein can be diluted to about 1.5mg/ml, aliquoted and
frozen (-80°C) for later use; when frozen, protocol for use is as follows: on the day of the
assay, frozen Membrane Protein is thawed at room temperature, followed by vortex and then
homogenized with a polytron at about 12 x 1,000 rpm for about 5-10 seconds; it is noted that
for multiple preparations, the homogenizer should be thoroughly cleaned between
homogenization of different preparations).

a. Materials

Binding Buffer (as per above); Bradford Dye Reagent; Bradford Protein Standard are
utilized, following manufacturer instructions (Biorad, cat. no. 500-0006).

b. Procedure

Duplicate tubes are prepared, one including the membrane, and one as a control
"blank". Each contains 800ul Binding Buffer. Thereafter, 10ul of Bradford Protein Standard
(1mg/ml) is added to each tube, and 10ul of Membrane Protein is then added to just one tube
(not the blank). Thereafter, 200ul of Bradford Dye Reagent is added to each tube, followed
by vortex of each. After five (5) minutes, the tubes are re-vortexed and the material therein
is transferred to cuvettes. The cuvettes are then read using a CECIL 3041 spectrophotometer,
at wavelength 595.

Direct Identification Assay

a. Materials

GDP Buffer consists of 37.5 ml Binding Buffer and 2mg GDP (Sigma, cat. no. G-7127),
followed by a series of dilutions in Binding Buffer to obtain 0.2 uM GDP (final
concentration of GDP in each well was 0.1 uM GDP); each well comprising a candidate
compound has a final volume of 200ul consisting of 100ul GDP Buffer (final concentration,
0.1uM GDP), 50ul Membrane Protein in Binding Buffer, and 50ul [35S]GTPγS (0.6 nM) in
Binding Buffer (2.5 ul [35S]GTPγS per 10ml Binding Buffer).

b. Procedure

Candidate compounds are preferably screened using a 96-well plate format (these can
be frozen at -80°C). Membrane Protein (or membranes with expression vector excluding the
GPCR Fusion Protein, as control), are homogenized briefly until in suspension. Protein
concentration is then determined using the Bradford Protein Assay set forth above. Membrane
Protein (and control) is then diluted to 0.25mg/ml in Binding Buffer (final assay
concentration, 12.5ug/well). Thereafter, 100 ul GDP Buffer is added to each well of a Wallac
Scintistrip™ (Wallac). A 5ul pin-tool is then used to transfer 5 ul of a candidate compound
into such well (i.e., 5ul in total assay volume of 200 ul is a 1:40 ratio such that the final
screening concentration of the candidate compound is 10uM). Again, to avoid contamination,
after each transfer step the pin tool should be rinsed in three reservoirs comprising water (1X),
ethanol (1X) and water (2X); excess liquid should be shaken from the tool after each rinse
and dried with paper and kimwipes. Thereafter, 50 ul of Membrane Protein is added to each
well (a control well comprising membranes without the GPCR Fusion Protein is also utilized),
and pre-incubated for 5-10 minutes at room temperature. Thereafter, 50 ul of [35S]GTPγS (0.6
nM) in Binding Buffer is added to each well, followed by incubation on a shaker for 60
minutes at room temperature (again, in this example, plates were covered with foil). The
assay is then stopped by spinning of the plates at 4000 RPM for 15 minutes at 22°C. The
plates are then aspirated with an 8 channel manifold and sealed with plate covers. The plates
are then read on a Wallac 1450 using setting "Prot. #37" (as per manufacturer instructions).
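
The per-well quantities stated in the foregoing protocol follow from simple dilution arithmetic, sketched below for illustration. The function and key names are hypothetical, and the 400uM compound stock is an inferred value (the text states only the 1:40 ratio and the 10uM final concentration).

```python
WELL_UL = 200.0  # total assay volume per well, as stated in the protocol


def diluted(stock_conc: float, added_ul: float, total_ul: float) -> float:
    """Concentration after adding added_ul of stock into total_ul final volume."""
    return stock_conc * added_ul / total_ul


def well_summary() -> dict:
    """Recompute the per-well figures quoted in the protocol."""
    return {
        # 100ul of 0.2uM GDP Buffer in a 200ul well
        "gdp_final_uM": diluted(0.2, 100.0, WELL_UL),
        # 50ul of Membrane Protein at 0.25mg/ml (mg/ml x ul = ug)
        "membrane_ug_per_well": 0.25 * 50.0,
        # 5ul of compound in 200ul
        "compound_dilution_factor": WELL_UL / 5.0,
        # inferred stock concentration needed for a 10uM final concentration
        "compound_stock_uM": 10.0 * WELL_UL / 5.0,
    }


if __name__ == "__main__":
    for name, value in well_summary().items():
        print(name, value)
```

The sketch treats the stated 200ul as the final volume, as the protocol does, and does not add the 5ul compound transfer to it.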
Example 7 

Protocol: Confirmation Assay

Using an independent assay approach to provide confirmation of a directly identified
candidate compound as set forth above, it is preferred that a confirmation assay then be
utilized. In this case, the preferred confirmation assay is a cyclase-based assay.

A modified Flash Plate™ Adenylyl Cyclase kit (New England Nuclear; Cat. No.
SMP004A) is preferably utilized for confirmation of candidate compounds directly identified
as inverse agonists and agonists to non-endogenous, constitutively activated orphan GPCRs
in accordance with the following protocol.

Transfected cells are harvested approximately three days after transfection.
Membranes are prepared by homogenization of suspended cells in buffer containing 20mM
HEPES, pH 7.4 and 10mM MgCl2. Homogenization is performed on ice using a Brinkman
Polytron™ for approximately 10 seconds. The resulting homogenate is centrifuged at 49,000
X g for 15 minutes at 4°C. The resulting pellet is then resuspended in buffer containing
20mM HEPES, pH 7.4 and 0.1mM EDTA, homogenized for 10 seconds, followed by
centrifugation at 49,000 X g for 15 minutes. The resulting pellet can be stored at
-80°C until utilized. On the day of direct identification screening, the membrane pellet is
slowly thawed at room temperature, resuspended in buffer containing 20mM HEPES, pH 7.4
and 10mM MgCl2, to yield a final protein concentration of 0.60mg/ml (the resuspended
membranes are placed on ice until use).

cAMP standards and Detection Buffer (comprising 2 uCi of tracer [125I]cAMP (100
ul) to 11 ml Detection Buffer) are prepared and maintained in accordance with the
manufacturer's instructions. Assay Buffer is prepared fresh for screening and contains
20mM HEPES, pH 7.4, 10mM MgCl2, 20mM phosphocreatine (Sigma), 0.1 units/ml creatine
phosphokinase (Sigma), 50 uM GTP (Sigma), and 0.2 mM ATP (Sigma); Assay Buffer can
be stored on ice until utilized.

Candidate compounds identified as per above (if frozen, thawed at room temperature)
are added, preferably, to 96-well plate wells (3ul/well; 12uM final assay concentration),
together with 40ul Membrane Protein (30ug/well) and 50ul of Assay Buffer. This admixture
is then incubated for 30 minutes at room temperature, with gentle shaking.

Following the incubation, 100ul of Detection Buffer is added to each well, followed
by incubation for 2-24 hours. Plates are then counted in a Wallac MicroBeta™ plate reader
using "Prot #31" (as per manufacturer instructions).

It is intended that each of the patents, applications, and printed publications mentioned 
in this patent document be hereby incorporated by reference in their entirety. 

As those skilled in the art will appreciate, numerous changes and modifications may
be made to the preferred embodiments of the invention without departing from the spirit of
the invention. It is intended that all such variations fall within the scope of the invention.

Although a variety of expression vectors are available to those in the art, for
purposes of utilization for both the endogenous and non-endogenous human GPCRs, it is
most preferred that the vector utilized be pCMV. This vector was deposited with the
American Type Culture Collection (ATCC) on October 13, 1998 (10801 University Blvd.,
Manassas, VA 20110-2209 USA) under the provisions of the Budapest Treaty for the
International Recognition of the Deposit of Microorganisms for the Purpose of Patent
Procedure. The DNA was tested by the ATCC and determined to be viable. The ATCC has
assigned the following deposit number to pCMV: ATCC #203351.

What is claimed is:

1. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hARE-3(F313K).

2. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 1.

3. A Plasmid comprising a Vector and the cDNA of claim 1.

4. A Host Cell comprising the Plasmid of claim 3.

5. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hARE-4(V233K).

6. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 5.

7. A Plasmid comprising a Vector and the cDNA of claim 5.

8. A Host Cell comprising the Plasmid of claim 7.

9. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hARE-5(A240K).

10. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 9.

11. A Plasmid comprising a Vector and the cDNA of claim 5.

12. A Host Cell comprising the Plasmid of claim 11.

13. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hGPCR14(L257K).

14. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 13.

15. A Plasmid comprising a Vector and the cDNA of claim 13.

16. A Host Cell comprising the Plasmid of claim 15.

17. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hGPCR27(C283K).

18. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 17.

19. A Plasmid comprising a Vector and the cDNA of claim 17.

20. A Host Cell comprising the Plasmid of claim 19.

21. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hARE-1(E232K).

22. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 21.

23. A Plasmid comprising a Vector and the cDNA of claim 21.

24. A Host Cell comprising the Plasmid of claim 23.

25. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hARE-2(G285K).

26. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 25.

27. A Plasmid comprising a Vector and the cDNA of claim 25.

28. A Host Cell comprising the Plasmid of claim 27.

29. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hPPR1(L239K).

30. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 29.

31. A Plasmid comprising a Vector and the cDNA of claim 29.

32. A Host Cell comprising the Plasmid of claim 31.

33. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hG2A(K232A).

34. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 33.

35. A Plasmid comprising a Vector and the cDNA of claim 33.

36. A Host Cell comprising the Plasmid of claim 35.

37. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hRUP3(L224K).

38. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 37.

39. A Plasmid comprising a Vector and the cDNA of claim 37.

40. A Host Cell comprising the Plasmid of claim 39.

41. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hRUP5(A236K).

42. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 41.

43. A Plasmid comprising a Vector and the cDNA of claim 41.

44. A Host Cell comprising the Plasmid of claim 42.

45. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hRUP6(N267K).

46. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 45.

47. A Plasmid comprising a Vector and the cDNA of claim 45.

48. A Host Cell comprising the Plasmid of claim 47.

49. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hRUP7(A302K).

50. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 49.

51. A Plasmid comprising a Vector and the cDNA of claim 49.

52. A Host Cell comprising the Plasmid of claim 51.

53. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hCHN4(V236K).

54. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 53.

55. A Plasmid comprising a Vector and the cDNA of claim 53.

56. A Host Cell comprising the Plasmid of claim 55.

57. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hMC4(A244K).

58. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 57.

59. A Plasmid comprising a Vector and the cDNA of claim 57.

60. A Host Cell comprising the Plasmid of claim 60.

61. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hCHN3(S284K).

62. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 61.

63. A Plasmid comprising a Vector and the cDNA of claim 61.

64. A Host Cell comprising the Plasmid of claim 63.

65. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hCHN6(L352K).

66. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 65.

67. A Plasmid comprising a Vector and the cDNA of claim 65.

68. A Host Cell comprising the Plasmid of claim 67.

69. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hCHN8(N235K).

70. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 69.

71. A Plasmid comprising a Vector and the cDNA of claim 69.

72. A Host Cell comprising the Plasmid of claim 71.

73. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled receptor comprising hH9(F236K).

74. A non-endogenous version of a human G protein-coupled receptor encoded by the
cDNA of claim 73.

75. A Plasmid comprising a Vector and the cDNA of claim 73.

76. A Host Cell comprising the Plasmid of claim 74.

77. A cDNA encoding a non-endogenous, constitutively activated version of a human
G protein-coupled AT1 receptor selected from the group consisting of:
hAT1(F239K); hAT1(N111A); hAT1(AT2K255IC3); and hAT1(A243+).

78. A non-endogenous version of a human G protein-coupled receptor encoded by a
cDNA of claim 77.

79. A Plasmid comprising a Vector and the cDNA of claim 77.

80. A Host Cell comprising the Plasmid of claim 79.

***************************

SEQUENCE LISTING

(1) GENERAL INFORMATION:

     (i) APPLICANT:
          Behan, Dominic P.
          Lehmann-Bruinsma, Karin
          Chalmers, Derek T.
          Lowitz, Kevin P.
          Lin, I-Lin
          Dang, Huong T.
          Chen, Ruoping
          Liaw, Chen W.
          Gore, Martin J.
          White, Carol

     (ii) TITLE OF INVENTION: Non-Endogenous, Constitutively Activated Human G
          Protein-Coupled Receptors

     (iii) NUMBER OF SEQUENCES: 146

     (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Arena Pharmaceuticals, Inc.
          (B) STREET: 6166 Nancy Ridge Drive
          (C) CITY: San Diego
          (D) STATE: CA
          (E) COUNTRY: USA
          (F) ZIP: 92121

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.30

     (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: US
          (B) FILING DATE:
          (C) CLASSIFICATION:

     (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Burgoon, Richard P.
          (B) REGISTRATION NUMBER: 34,787

     (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (858)453-7200
          (B) TELEFAX: (858)453-7210

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1260 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

     (ii) MOLECULE TYPE: DNA (genomic)

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATGGTCTTCT CGGCAGTGTT GACTGCGTTC CATACCGGGA CATCCAACAC AACATTTGTC       60
GTGTATGAAA ACACCTACAT GAATATTACA CTCCCTCCAC CATTCCAGCA TCCTGACCTC      120
AGTCCATTGC TTAGATATAG TTTTGAAACC ATGGCTCCCA CTGGTTTGAG TTCCTTGACC      180
GTGAATAGTA CAGCTGTGCC CACAACACCA GCAGCATTTA AGAGCCTAAA CTTGCCTCTT      240
CAGATCACCC TTTCTGCTAT AATGATATTC ATTCTGTTTG TGTCTTTTCT TGGGAACTTG      300
GTTGTTTGCC TCATGGTTTA CCAAAAAGCT GCCATGAGGT CTGCAATTAA CATCCTCCTT      360
GCCAGCCTAG CTTTTGCAGA CATGTTGCTT GCAGTGCTGA ACATGCCCTT TGCCCTGGTA      420
ACTATTCTTA CTACCCGATG GATTTTTGGG AAATTCTTCT GTAGGGTATC TGCTATGTTT      480
TTCTGGTTAT TTGTGATAGA AGGAGTAGCC ATCCTGCTCA TCATTAGCAT AGATAGGTTC      540
CTTATTATAG TCCAGAGGCA GGATAAGCTA AACCCATATA GAGCTAAGGT TCTGATTGCA      600
GTTTCTTGGG CAACTTCCTT TTGTGTAGCT TTTCCTTTAG CCGTAGGAAA CCCCGACCTG      660
CAGATACCTT CCCGAGCTCC CCAGTGTGTG TTTGGGTACA CAACCAATCC AGGCTACCAG      720
GCTTATGTGA TTTTGATTTC TCTCATTTCT TTCTTCATAC CCTTCCTGGT AATACTGTAC      780
TCATTTATGG GCATACTCAA CACCCTTCGG CACAATGCCT TGAGGATCCA TAGCTACCCT      840
GAAGGTATAT GCCTCAGCCA GGCCAGCAAA CTGGGTCTCA TGAGTCTGCA GAGACCTTTC      900
CAGATGAGCA TTGACATGGG CTTTAAAACA CGTGCCTTCA CCACTATTTT GATTCTCTTT      960
GCTGTCTTCA TTGTCTGCTG GGCCCCATTC ACCACTTACA GCCTTGTGGC AACATTCAGT     1020
AAGCACTTTT ACTATCAGCA CAACTTTTTT GAGATTAGCA CCTGGCTACT GTGGCTCTGC     1080
TACCTCAAGT CTGCATTGAA TCCGCTGATC TACTACTGGA GGATTAAGAA ATTCCATGAT     1140
GCTTGCCTGG ACATGATGCC TAAGTCCTTC AAGTTTTTGC CGCAGCTCCC TGGTCACACA     1200
AAGCGACGGA TACGTCCTAG TGCTGTCTAT GTGTGTGGGG AACATCGGAC GGTGGTGTGA     1260

(3) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 419 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

     (ii) MOLECULE TYPE: DNA (genomic)

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Val Phe Ser Ala Val Leu Thr Ala Phe His Thr Gly Thr Ser Asn
1               5                   10                  15

Thr Thr Phe Val Val Tyr Glu Asn Thr Tyr Met Asn Ile Thr Leu Pro
            20                  25                  30

Pro Pro Phe Gln His Pro Asp Leu Ser Pro Leu Leu Arg Tyr Ser Phe
        35                  40                  45

Glu Thr Met Ala Pro Thr Gly Leu Ser Ser Leu Thr Val Asn Ser Thr
    50                  55                  60

Ala Val Pro Thr Thr Pro Ala Ala Phe Lys Ser Leu Asn Leu Pro Leu
65                  70                  75                  80

Gln Ile Thr Leu Ser Ala Ile Met Ile Phe Ile Leu Phe Val Ser Phe
                85                  90                  95

Leu Gly Asn Leu Val Val Cys Leu Met Val Tyr Gln Lys Ala Ala Met
            100                 105                 110

Arg Ser Ala Ile Asn Ile Leu Leu Ala Ser Leu Ala Phe Ala Asp Met
        115                 120                 125

Leu Leu Ala Val Leu Asn Met Pro Phe Ala Leu Val Thr Ile Leu Thr
    130                 135                 140

Thr Arg Trp Ile Phe Gly Lys Phe Phe Cys Arg Val Ser Ala Met Phe
145                 150                 155                 160

Phe Trp Leu Phe Val Ile Glu Gly Val Ala Ile Leu Leu Ile Ile Ser
                165                 170                 175

Ile Asp Arg Phe Leu Ile Ile Val Gln Arg Gln Asp Lys Leu Asn Pro
            180                 185                 190

Tyr Arg Ala Lys Val Leu Ile Ala Val Ser Trp Ala Thr Ser Phe Cys
        195                 200                 205

Val Ala Phe Pro Leu Ala Val Gly Asn Pro Asp Leu Gln Ile Pro Ser
    210                 215                 220

Arg Ala Pro Gln Cys Val Phe Gly Tyr Thr Thr Asn Pro Gly Tyr Gln
225                 230                 235                 240

Ala Tyr Val Ile Leu Ile Ser Leu Ile Ser Phe Phe Ile Pro Phe Leu
                245                 250                 255

Val Ile Leu Tyr Ser Phe Met Gly Ile Leu Asn Thr Leu Arg His Asn
            260                 265                 270

Ala Leu Arg Ile His Ser Tyr Pro Glu Gly Ile Cys Leu Ser Gln Ala
        275                 280                 285

Ser Lys Leu Gly Leu Met Ser Leu Gln Arg Pro Phe Gln Met Ser Ile
    290                 295                 300

Asp Met Gly Phe Lys Thr Arg Ala Phe Thr Thr Ile Leu Ile Leu Phe
305                 310                 315                 320

Ala Val Phe Ile Val Cys Trp Ala Pro Phe Thr Thr Tyr Ser Leu Val
                325                 330                 335

Ala Thr Phe Ser Lys His Phe Tyr Tyr Gln His Asn Phe Phe Glu Ile
            340                 345                 350

Ser Thr Trp Leu Leu Trp Leu Cys Tyr Leu Lys Ser Ala Leu Asn Pro
        355                 360                 365

Leu Ile Tyr Tyr Trp Arg Ile Lys Lys Phe His Asp Ala Cys Leu Asp
    370                 375                 380

Met Met Pro Lys Ser Phe Lys Phe Leu Pro Gln Leu Pro Gly His Thr
385                 390                 395                 400

Lys Arg Arg Ile Arg Pro Ser Ala Val Tyr Val Cys Gly Glu His Arg
                405                 410                 415

Thr Val Val
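
The relationship between the nucleotide listing of SEQ ID NO:1 and the amino acid listing of SEQ ID NO:2 can be illustrated by a minimal sketch, assuming the standard genetic code. The function names are hypothetical; only the first 60 nucleotides of SEQ ID NO:1 are reproduced, and the sketch is offered as a consistency check rather than as part of the listing.

```python
# Standard genetic code, built in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence (one-letter code), stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop the spaces between 10-base groups
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def expected_residues(n_bases: int) -> int:
    """Residues encoded by an open reading frame of n_bases that ends in a stop codon."""
    return n_bases // 3 - 1


# First 60 nucleotides of SEQ ID NO:1, as listed.
SEQ1_FIRST_60 = "ATGGTCTTCT CGGCAGTGTT GACTGCGTTC CATACCGGGA CATCCAACAC AACATTTGTC"

if __name__ == "__main__":
    print(translate(SEQ1_FIRST_60))
    print(expected_residues(1260))
```

The 1260 base pairs of SEQ ID NO:1 correspond to 419 codons plus a stop codon, which agrees with the stated length of SEQ ID NO:2.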



(4) INFORMATION FOR SEQ ID NO:3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1119 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

     (ii) MOLECULE TYPE: DNA (genomic)

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATGTTAGCCA ACAGCTCCTC AACCAACAGT TCTGTTCTCC CGTGTCCTGA CTACCGACCT       60
ACCCACCGCC TGCACTTGGT GGTCTACAGC TTGGTGCTGG CTGCCGGGCT CCCCCTCAAC      120
GCGCTAGCCC TCTGGGTCTT CCTGCGCGCG CTGCGCGTGC ACTCGGTGGT GAGCGTGTAC      180
ATGTGTAACC TGGCGGCCAG CGACCTGCTC TTCACCCTCT CGCTGCCCGT TCGTCTCTCC      240
TACTACGCAC TGCACCACTG GCCCTTCCCC GACCTCCTGT GCCAGACGAC GGGCGCCATC      300
TTCCAGATGA ACATGTACGG CAGCTGCATC TTCCTGATGC TCATCAACGT GGACCGCTAC      360
GCCGCCATCG TGCACCCGCT GCGACTGCGC CACCTGCGGC GGCCCCGCGT GGCGCGGCTG      420
CTCTGCCTGG GCGTGTGGGC GCTCATCCTG GTGTTTGCCG TGCCCGCCGC CCGCGTGCAC      480
AGGCCCTCGC GTTGCCGCTA CCGGGACCTC GAGGTGCGCC TATGCTTCGA GAGCTTCAGC      540
GACGAGCTGT GGAAAGGCAG GCTGCTGCCC CTCGTGCTGC TGGCCGAGGC GCTGGGCTTC      600
GTGCTGCCCC TGGCGGCGGT GGTCTACTCG TCGGGCCGAG TCTTCTGGAC GCTGGCGCGC      660
CCCGACGCCA CGCAGAGCCA GCGGCGGCGG AAGACCGTGC GCCTCCTGCT GGCTAACCTC      720
GTCATCTTCC TGCTGTGCTT CGTGCCCTAC AACAGCACGC TGGCGGTCTA CGGGCTGCTG      780
CGGAGCAAGC TGGTGGCGGC GAGGGTGCCT GCCCGCGATC GCGTGCGCGG GGTGCTGATG      840
GTGATGGTGC TGCTGGCCGG CGCCAACTGC GTGCTGGACC CGCTGGTGTA CTACTTTAGC      900
GCCGAGGGCT TCCGCAACAC CCTGCGCGGC CTGGGCACTC CGCACCGGGC CAGGACCTCG      960
GCCACCAACG GGACGCGGGC GGCGCTCGCG CAATCCGAAA GGTCCGCCGT CACCACCGAC     1020
GCCACCAGGC CGGATGCCGC CAGTCAGGGG CTGCTCCGAC CCTCCGACTC CCACTCTCTG     1080
TCTTCCTTCA CACAGTGTCC CCAGGATTCC GCCCTCTGA                            1119

(5) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 372 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: not relevant

     (ii) MOLECULE TYPE: protein

     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Leu Ala Asn Ser Ser Ser Thr Asn Ser Ser Val Leu Pro Cys Pro
1 5 10 15

Asp Tyr Arg Pro Thr His Arg Leu His Leu Val Val Tyr Ser Leu Val
20 25 30

Leu Ala Ala Gly Leu Pro Leu Asn Ala Leu Ala Leu Trp Val Phe Leu
35 40 45

Arg Ala Leu Arg Val His Ser Val Val Ser Val Tyr Met Cys Asn Leu
50 55 60

Ala Ala Ser Asp Leu Leu Phe Thr Leu Ser Leu Pro Val Arg Leu Ser
65 70 75 80

Tyr Tyr Ala Leu His His Trp Pro Phe Pro Asp Leu Leu Cys Gln Thr
85 90 95

Thr Gly Ala Ile Phe Gln Met Asn Met Tyr Gly Ser Cys Ile Phe Leu
100 105 110

Met Leu Ile Asn Val Asp Arg Tyr Ala Ala Ile Val His Pro Leu Arg
115 120 125

Leu Arg His Leu Arg Arg Pro Arg Val Ala Arg Leu Leu Cys Leu Gly
130 135 140

Val Trp Ala Leu Ile Leu Val Phe Ala Val Pro Ala Ala Arg Val His
145 150 155 160

Arg Pro Ser Arg Cys Arg Tyr Arg Asp Leu Glu Val Arg Leu Cys Phe
165 170 175

Glu Ser Phe Ser Asp Glu Leu Trp Lys Gly Arg Leu Leu Pro Leu Val
180 185 190

Leu Leu Ala Glu Ala Leu Gly Phe Leu Leu Pro Leu Ala Ala Val Val
195 200 205

Tyr Ser Ser Gly Arg Val Phe Trp Thr Leu Ala Arg Pro Asp Ala Thr
210 215 220

Gln Ser Gln Arg Arg Arg Lys Thr Val Arg Leu Leu Leu Ala Asn Leu
225 230 235 240

Val Ile Phe Leu Leu Cys Phe Val Pro Tyr Asn Ser Thr Leu Ala Val
245 250 255

Tyr Gly Leu Leu Arg Ser Lys Leu Val Ala Ala Ser Val Pro Ala Arg
260 265 270

Asp Arg Val Arg Gly Val Leu Met Val Met Val Leu Leu Ala Gly Ala
275 280 285

Asn Cys Val Leu Asp Pro Leu Val Tyr Tyr Phe Ser Ala Glu Gly Phe
290 295 300

Arg Asn Thr Leu Arg Gly Leu Gly Thr Pro His Arg Ala Arg Thr Ser
305 310 315 320

Ala Thr Asn Gly Thr Arg Ala Ala Leu Ala Gln Ser Glu Arg Ser Ala
325 330 335

Val Thr Thr Asp Ala Thr Arg Pro Asp Ala Ala Ser Gln Gly Leu Leu
340 345 350

Arg Pro Ser Asp Ser His Ser Leu Ser Ser Phe Thr Gln Cys Pro Gln
355 360 365

Asp Ser Ala Leu
370

(6) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1107 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:







ATGGCCAACT CCACAGGGCT GAACGCCTCA GAAGTCGCAG GL. 1 LG I TGGG GTTGATCCTG 60

GCAGCTGTCG TGGAGGTGGG GGCACTGCTG GGCAACGGCG Uot 1 bt. 1 GG 1 CGTGGTGCTG 120

CGCACGCCGG GACTGCGCGA CGCGCTCTAC CTGGCGCACC TGTGCGTCGT bbAL GTGCTG 180

GCGGCCGCCT CCATCATGCC GCTGGGCCTG CTGGCCGCAC CGCCGCCCGG GCTGGGCCGC 240

GTGCGCCTGG GCCCCGCGCC ATGCCGCGCC GCTCGCTTCC TCTCCGCCGC i L. i GG A bLLb 300

GCCTGCACGC TCGGGGTGGC CGCACTTGGC CTGGCACGCT ACCGCCTCAT Lb I GGAGGGG 360

CTGCGGCCAG GCTCGCGGCC GCCGCCTGTG CTCGTGCTCA CCGCCGTGTG bbLLbLboLb 420

GGACTGCTGG GCGCGCTCTC CCTGCTCGGC CCGCCGCCCG CACCGCCCCC 1 GV_ J. L L. 1 bL 1 480

CGCTGCTCGG TCCTGGCTGG GGGCCTCGGG CCCTTCCGGC CGCTCTGGGC CCTGCTGGCC 540

TTCGCGCTGC CCGCCCTCCT GCTGCTCGGC GCCTACGGCG GCATCTTCGT GGTGGCGCGT 600

CGCGCTGCCC TGAGGCCCCC ACGGCCGGCG CGCGGGTCCC GACTCCGCTC GGACTCTCTG 660

GATAGCCGCC TTTCCATCTT GCCGCCGCTC CGGCCTCGCC TGCCCGGGGG CAAGGCGGCC 720

CTGGCCCCAG CGCTGGCCGT GGGCCAATTT GCAGCCTGCT GGCTGCCTTA TGGCTGCGCG 780

TGCCTGGCGC CCGCAGCGCG GGCCGCGGAA GCCGAAGCGG CTGTCACCTG GGTCGCCTAC 840

TCGGCCTTCG CGGCTCACCC CTTCCTGTAC GGGCTGCTGC AGCGCCCCGT GCGCTTGGCA 900

CTGGGCCGCC TCTCTCGCCG TGCACTGCCT GGACCTGTGC GGGCCTGCAC TCCGCAAGCC 960

TGGCACCCGC GGGCACTCTT GCAATGCCTC CAGAGACCCC CAGAGGGCCC TGCCGTAGGC 1020

CCTTCTGAGG CTCCAGAACA GACCCCCGAG TTGGCAGGAG GGCGGAGCCC CGCATACCAG 1080

GGGCCACCTG AGAGTTCTCT CTCCTGA 1107



(7) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 368 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

Met Ala Asn Ser Thr Gly Leu Asn Ala Ser Glu Val Ala Gly Ser Leu
1 5 10 15

Gly Leu Ile Leu Ala Ala Val Val Glu Val Gly Ala Leu Leu Gly Asn
20 25 30

Gly Ala Leu Leu Val Val Val Leu Arg Thr Pro Gly Leu Arg Asp Ala
35 40 45

Leu Tyr Leu Ala His Leu Cys Val Val Asp Leu Leu Ala Ala Ala Ser
50 55 60

Ile Met Pro Leu Gly Leu Leu Ala Ala Pro Pro Pro Gly Leu Gly Arg
65 70 75 80

Val Arg Leu Gly Pro Ala Pro Cys Arg Ala Ala Arg Phe Leu Ser Ala
85 90 95

Ala Leu Leu Pro Ala Cys Thr Leu Gly Val Ala Ala Leu Gly Leu Ala
100 105 110

Arg Tyr Arg Leu Ile Val His Pro Leu Arg Pro Gly Ser Arg Pro Pro
115 120 125

Pro Val Leu Val Leu Thr Ala Val Trp Ala Ala Ala Gly Leu Leu Gly
130 135 140

Ala Leu Ser Leu Leu Gly Pro Pro Pro Ala Pro Pro Pro Ala Pro Ala
145 150 155 160

Arg Cys Ser Val Leu Ala Gly Gly Leu Gly Pro Phe Arg Pro Leu Trp
165 170 175

Ala Leu Leu Ala Phe Ala Leu Pro Ala Leu Leu Leu Leu Gly Ala Tyr
180 185 190

Gly Gly Ile Phe Val Val Ala Arg Arg Ala Ala Leu Arg Pro Pro Arg
195 200 205

Pro Ala Arg Gly Ser Arg Leu Arg Ser Asp Ser Leu Asp Ser Arg Leu
210 215 220

Ser Ile Leu Pro Pro Leu Arg Pro Arg Leu Pro Gly Gly Lys Ala Ala
225 230 235 240

Leu Ala Pro Ala Leu Ala Val Gly Gln Phe Ala Ala Cys Trp Leu Pro
245 250 255

Tyr Gly Cys Ala Cys Leu Ala Pro Ala Ala Arg Ala Ala Glu Ala Glu
260 265 270

Ala Ala Val Thr Trp Val Ala Tyr Ser Ala Phe Ala Ala His Pro Phe
275 280 285

Leu Tyr Gly Leu Leu Gln Arg Pro Val Arg Leu Ala Leu Gly Arg Leu
290 295 300

Ser Arg Arg Ala Leu Pro Gly Pro Val Arg Ala Cys Thr Pro Gln Ala
305 310 315 320

Trp His Pro Arg Ala Leu Leu Gln Cys Leu Gln Arg Pro Pro Glu Gly
325 330 335

Pro Ala Val Gly Pro Ser Glu Ala Pro Glu Gln Thr Pro Glu Leu Ala
340 345 350

Gly Gly Arg Ser Pro Ala Tyr Gln Gly Pro Pro Glu Ser Ser Leu Ser
355 360 365

(8) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1008 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
ATGGAATCAT CTTTCTCATT TGGAGTGATC CTTGCTGTCC TGGCCTCCCT CATCATTGCT 60
ACTAACACAC TAGTGGCTGT GGCTGTGCTG CTGTTGATCC ACAAGAATGA TGGTGTCAGT 120
CTCTGCTTCA CCTTGAATCT GGCTGTGGCT GACACCTTGA TTGGTGTGGC CATCTCTGGC 180
CTACTCACAG ACCAGCTCTC CAGCCCTTCT CGGCCCACAC AGAAGACCCT GTGCAGCCTG 240
CGGATGGCAT TTGTCACTTC CTCCGCAGCT GCCTCTGTCC TCACGGTCAT GCTGATCACC 300
TTTGACAGGT ACCTTGCCAT CAAGCAGCCC TTCCGCTACT TGAAGATCAT GAGTGGGTTC 360
GTGGCCGGGG CCTGCATTGC CGGGCTGTGG TTAGTGTCTT ACCTCATTGG CTTCCTCCCA 420
CTCGGAATCC CCATGTTCCA GCAGACTGCC TACAAAGGGC AGTGCAGCTT CTTTGCTGTA 480
TTTCACCCTC ACTTCGTGCT GACCCTCTCC TGCGTTGGCT TCTTCCCAGC CATGCTCCTC 540
TTTGTCTTCT TCTACTGCGA CATGCTCAAG ATTGCCTCCA TGCACAGCCA GCAGATTCGA 600



WO 00/22131 PCT/US99/24065 


AAGATGGAAC ATGGAGGAGC CATGGCTGGA GGTTATCGAT CCCCACGGAC TCCCAGCGAC 660 
TTCAAAGCTC TCCGTACTGT GTCTGTTCTC ATTGGGAGCT TTGCTCTATC CTGGACGCCC 720 
TTCCTTATCA CTGGCATTGT GCAGGTGGCC TGCCAGGAGT GTCACCTCTA CCTAGTGCTG 780 
GAACGGTACC TGTGGCTGCT CGGCGTGGGC AACTCCCTGC TCAACCCACT GATCTATGCC 840 
TATTGGCAGA AGGAGGTGCG ACTGCAGCTC TACCACATGG CCCTAGGAGT GAAGAAGGTG 900 
CTCACCTCAT TCCTCCTCTT TCTCTCGGCC AGGAATTGTG GCCCAGAGAG GCCCAGGGAA 960

AGTTCCTGTC ACATCGTCAC TATCTCCAGC TCAGAGTTTG ATGGCTAA 1008 
(9) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 335 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Glu Ser Ser Phe Ser Phe Gly Val Ile Leu Ala Val Leu Ala Ser
1 5 10 15

Leu Ile Ile Ala Thr Asn Thr Leu Val Ala Val Ala Val Leu Leu Leu
20 25 30

Ile His Lys Asn Asp Gly Val Ser Leu Cys Phe Thr Leu Asn Leu Ala
35 40 45

Val Ala Asp Thr Leu Ile Gly Val Ala Ile Ser Gly Leu Leu Thr Asp
50 55 60

Gln Leu Ser Ser Pro Ser Arg Pro Thr Gln Lys Thr Leu Cys Ser Leu
65 70 75 80

Arg Met Ala Phe Val Thr Ser Ser Ala Ala Ala Ser Val Leu Thr Val
85 90 95

Met Leu Ile Thr Phe Asp Arg Tyr Leu Ala Ile Lys Gln Pro Phe Arg
100 105 110

Tyr Leu Lys Ile Met Ser Gly Phe Val Ala Gly Ala Cys Ile Ala Gly
115 120 125

Leu Trp Leu Val Ser Tyr Leu Ile Gly Phe Leu Pro Leu Gly Ile Pro
130 135 140

Met Phe Gln Gln Thr Ala Tyr Lys Gly Gln Cys Ser Phe Phe Ala Val
145 150 155 160

Phe His Pro His Phe Val Leu Thr Leu Ser Cys Val Gly Phe Phe Pro
165 170 175

Ala Met Leu Leu Phe Val Phe Phe Tyr Cys Asp Met Leu Lys Ile Ala
180 185 190

Ser Met His Ser Gln Gln Ile Arg Lys Met Glu His Ala Gly Ala Met
195 200 205

Ala Gly Gly Tyr Arg Ser Pro Arg Thr Pro Ser Asp Phe Lys Ala Leu
210 215 220

Arg Thr Val Ser Val Leu Ile Gly Ser Phe Ala Leu Ser Trp Thr Pro
225 230 235 240

Phe Leu Ile Thr Gly Ile Val Gln Val Ala Cys Gln Glu Cys His Leu
245 250 255

Tyr Leu Val Leu Glu Arg Tyr Leu Trp Leu Leu Gly Val Gly Asn Ser
260 265 270

Leu Leu Asn Pro Leu Ile Tyr Ala Tyr Trp Gln Lys Glu Val Arg Leu
275 280 285

Gln Leu Tyr His Met Ala Leu Gly Val Lys Lys Val Leu Thr Ser Phe
290 295 300

Leu Leu Phe Leu Ser Ala Arg Asn Cys Gly Pro Glu Arg Pro Arg Glu
305 310 315 320

Ser Ser Cys His Ile Val Thr Ile Ser Ser Ser Glu Phe Asp Gly
325 330 335

(10) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1413 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ATGGACACTA CCATGGAAGC TGACCTGGGT GCCACTGGCC ACAGGCCCCG CACAGAGCTT 60

GATGATGAGG ACTCCTACCC CCAAGGTGGC TGGGACACGG TCTTCCTGGT GGCCCTGCTG 120

CTCCTTGGGC TGCCAGCCAA TGGGTTGATG GCGTGGCTGG CCGGCTCCCA GGCCCGGCAT 180

GGAGCTGGCA CGCGTCTGGC GCTGCTCCTG CTCAGCCTGG CCCTCTCTGA CTTCTTGTTC 240

CTGGCAGCAG CGGCCTTCCA GATCCTAGAG ATCCGGCATG GGGGACACTG GCCGCTGGGG 300

ACAGCTGCCT GCCGCTTCTA CTACTTCCTA TGGGGCGTGT CCTAGTCCTC CGGCCTCTTC 360

CTGCTGGCCG CCCTCAGCCT CGACCGCTGC CTGCTGGCGC TGTGCCCACA CTGGTACCCT 420

GGGCACCGCC CAGTCCGCCT GCCCCTCTGG GTCTGCGCCG GTGTCTGGGT GCTGGCCACA 480

CTCTTCAGCG TGCCCTGGCT GGTCTTCCCC GAGGCTGCCG TCTGGTGGTA CGACCTGGTC 540

ATCTGCCTGG ACTTCTGGGA CAGCGAGGAG CTGTCGCTGA GGATGCTGGA GGTCCTGGGG 600

GGCTTCCTGC CTTTCCTCCT GCTGCTCGTC TGCCACGTGC TCACCCAGGC CACAGCCTGT 660

GGCACCTGCC ACCGCCAACA GCAGCCCGCA GCCTGCCGGG GCTTCGCCCG TGTGGCCAGG 720

ACCATTCTGT GAGCCTATGT GGTCCTGAGG CTGCCCTACC AGCTGGCCCA GCTGCTCTAC 780

CTGGCCTTCC TGTGGGACGT CTACTCTGGC TACCTGCTCT GGGAGGCCCT GGTCTACTCC 840

GAGTACCTGA TCCTACTCAA CAGCTGCCTC AGCCCCTTCC TCTGCCTCAT GGCCAGTGCC 900

GACCTCCGGA CCCTGCTGCG CTCCGTGCTC TCGTCCTTCG CGGCAGCTCT CTGCGAGGAG 960

CGGCCGGGCA GCTTCACGCC CACTGAGCCA CAGACCCAGC TAGATTCTGA GGGTCCAACT 1020

CTGCCAGAGC CGATGGCAGA GGCCCAGTCA CAGATGGATC CTGTGGCCCA GCCTCAGGTG 1080

AACCCCACAC TCCAGCCACG ATCGGATCCC ACAGCTCAGC CACAGCTGAA CCCTACGGCC 1140

CAGCCACAGT CGGATCCCAC AGCCCAGCCA CAGCTGAACC TCATGGCCCA GCCACAGTCA 1200

GATTCTGTGG CCCAGCCACA GGCAGACACT AACGTCCAGA CCCCTGCACC TGCTGCCAGT 1260

TCTGTGCCCA GTCCCTGTGA TGAAGCTTCC CCAACCCCAT CCTCGCATCC TACCCCAGGG 1320

GCCCTTGAGG ACCCAGCCAC ACCTGCTGCC TCTGAAGGAG AAAGCCCCAG CAGCACCCCG 1380

CCAGAGGCGG CCCCGGGCGC AGGCCCCACG TGA 1413
(11) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 amino acids 

(B) TYPE: amino acid 
(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Asp Thr Thr Met Glu Ala Asp Leu Gly Ala Thr Gly His Arg Pro
1 5 10 15

Arg Thr Glu Leu Asp Asp Glu Asp Ser Tyr Pro Gln Gly Gly Trp Asp
20 25 30

Thr Val Phe Leu Val Ala Leu Leu Leu Leu Gly Leu Pro Ala Asn Gly
35 40 45

Leu Met Ala Trp Leu Ala Gly Ser Gln Ala Arg His Gly Ala Gly Thr
50 55 60

Arg Leu Ala Leu Leu Leu Leu Ser Leu Ala Leu Ser Asp Phe Leu Phe
65 70 75 80

Leu Ala Ala Ala Ala Phe Gln Ile Leu Glu Ile Arg His Gly Gly His
85 90 95

Trp Pro Leu Gly Thr Ala Ala Cys Arg Phe Tyr Tyr Phe Leu Trp Gly
100 105 110

Val Ser Tyr Ser Ser Gly Leu Phe Leu Leu Ala Ala Leu Ser Leu Asp
115 120 125

Arg Cys Leu Leu Ala Leu Cys Pro His Trp Tyr Pro Gly His Arg Pro
130 135 140

Val Arg Leu Pro Leu Trp Val Cys Ala Gly Val Trp Val Leu Ala Thr
145 150 155 160

Leu Phe Ser Val Pro Trp Leu Val Phe Pro Glu Ala Ala Val Trp Trp
165 170 175

Tyr Asp Leu Val Ile Cys Leu Asp Phe Trp Asp Ser Glu Glu Leu Ser
180 185 190

Leu Arg Met Leu Glu Val Leu Gly Gly Phe Leu Pro Phe Leu Leu Leu
195 200 205

Leu Val Cys His Val Leu Thr Gln Ala Thr Arg Thr Cys His Arg Gln
210 215 220

Gln Gln Pro Ala Ala Cys Arg Gly Phe Ala Arg Val Ala Arg Thr Ile
225 230 235 240

Leu Ser Ala Tyr Val Val Leu Arg Leu Pro Tyr Gln Leu Ala Gln Leu
245 250 255

Leu Tyr Leu Ala Phe Leu Trp Asp Val Tyr Ser Gly Tyr Leu Leu Trp
260 265 270

Glu Ala Leu Val Tyr Ser Asp Tyr Leu Ile Leu Leu Asn Ser Cys Leu
275 280 285

Ser Pro Phe Leu Cys Leu Met Ala Ser Ala Asp Leu Arg Thr Leu Leu
290 295 300

Arg Ser Val Leu Ser Ser Phe Ala Ala Ala Leu Cys Glu Glu Arg Pro
305 310 315 320

Gly Ser Phe Thr Pro Thr Glu Pro Gln Thr Gln Leu Asp Ser Glu Gly
325 330 335

Pro Thr Leu Pro Glu Pro Met Ala Glu Ala Gln Ser Gln Met Asp Pro
340 345 350

Val Ala Gln Pro Gln Val Asn Pro Thr Leu Gln Pro Arg Ser Asp Pro
355 360 365

Thr Ala Gln Pro Gln Leu Asn Pro Thr Ala Gln Pro Gln Ser Asp Pro
370 375 380

Thr Ala Gln Pro Gln Leu Asn Leu Met Ala Gln Pro Gln Ser Asp Ser
385 390 395 400

Val Ala Gln Pro Gln Ala Asp Thr Asn Val Gln Thr Pro Ala Pro Ala
405 410 415

Ala Ser Ser Val Pro Ser Pro Cys Asp Glu Ala Ser Pro Thr Pro Ser
420 425 430

Ser His Pro Thr Pro Gly Ala Leu Glu Asp Pro Ala Thr Pro Pro Ala
435 440 445

Ser Glu Gly Glu Ser Pro Ser Ser Thr Pro Pro Glu Ala Ala Pro Gly
450 455 460

Ala Gly Pro Thr
465

(12) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1248 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATGTCAGGGA TGGAAAAACT TCAGAATGCT TCCTGGATCT ACCAGCAGAA ACTAGAAGAT 60

CCATTCCAGA AACACCTGAA CAGCACCGAG GAGTATCTGG CCTTCCTCTG CGGACCTCGG 120

CGCAGCCACT TCTTCCTCCC CGTGTCTGTG GTGTATGTGC CAATTTTTGT GGTGGGGGTC 180

ATTGGCAATG TCCTGGTGTG CCTGGTGATT CTGCAGCACC AGGCTATGAA GACGCCCACC 240

AACTACTACC TCTTCAGCCT GGCGGTCTCT GACCTCCTGG TCCTGCTCCT TGGAATGCCC 300

CTGGAGGTCT ATGAGATGTG GCGCAACTAC CCTTTCTTGT TCGGGCCCGT GGGCTGCTAC 360

TTCAAGACGG CCCTCTTTGA GACCGTGTGC TTCGCCTCCA TCCTCAGCAT CACCACCGTC 420

AGCGTGGAGC GCTACGTGGC CATCCTACAC CCGTTCCGCG CCAAACTGCA GAGCACCCGG 480

CGCCGGGCCC TCAGGATCCT CGGCATCGTC TGGGGCTTCT CCGTGCTCTT CTCCCTGCCC 540

AACACCAGCA TCCATGGCAT CAAGTTCCAC TACTTCCCCA ATGGGTCCCT GGTCCCAGGT 600

TCGGCCACCT GTACGGTCAT CAAGCCCATG TGGATCTACA ATTTCATCAT CCAGGTCACC 660

TCCTTCCTAT TCTACCTCCT CCCCATGACT GTCATCAGTG TCCTCTACTA CCTCATGGCA 720

CTCAGACTAA AGAAAGACAA ATCTCTTGAG GCAGATGAAG GGAATGCAAA TATTCAAAGA 780

CCCTGCAGAA AATCAGTCAA CAAGATGCTG TTTGTCTTGG TCTTAGTGTT TGCTATCTGT 840

TGGGCCCCGT TCCACATTGA CCGACTCTTC TTCAGCTTTG TGGAGGAGTG GAGTGAATCC 900

CTGGCTGCTG TGTTCAACCT CGTCCATGTG GTGTCAGGTG TCTTCTTCTA CCTGAGCTCA 960

GCTGTCAACC CCATTATCTA TAACCTACTG TCTCGCCGCT TCCAGGCAGC ATTCCAGAAT 1020

GTGATCTCTT CTTTCCACAA ACAGTGGCAC TCCCAGCATG ACCCACAGTT GCCACCTGCC 1080

CAGCGGAACA TCTTCCTGAC AGAATGCCAC TTTGTGGAGC TGACCGAAGA TATAGGTCCC 1140

CAATTCCCAT GTCAGTCATC CATGCACAAC TCTCACCTCC CAACAGCCCT CTCTAGTGAA 1200

CAGATGTCAA GAACAAACTA TCAAAGCTTC CACTTTAACA AAACCTGA 1248



(13) INFORMATION FOR SEQ ID NO:12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 415 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Ser Gly Met Glu Lys Leu Gln Asn Ala Ser Trp Ile Tyr Gln Gln
1 5 10 15

Lys Leu Glu Asp Pro Phe Gln Lys His Leu Asn Ser Thr Glu Glu Tyr
20 25 30

Leu Ala Phe Leu Cys Gly Pro Arg Arg Ser His Phe Phe Leu Pro Val
35 40 45

Ser Val Val Tyr Val Pro Ile Phe Val Val Gly Val Ile Gly Asn Val
50 55 60

Leu Val Cys Leu Val Ile Leu Gln His Gln Ala Met Lys Thr Pro Thr
65 70 75 80

Asn Tyr Tyr Leu Phe Ser Leu Ala Val Ser Asp Leu Leu Val Leu Leu
85 90 95

Leu Gly Met Pro Leu Glu Val Tyr Glu Met Trp Arg Asn Tyr Pro Phe
100 105 110

Leu Phe Gly Pro Val Gly Cys Tyr Phe Lys Thr Ala Leu Phe Glu Thr
115 120 125

Val Cys Phe Ala Ser Ile Leu Ser Ile Thr Thr Val Ser Val Glu Arg
130 135 140

Tyr Val Ala Ile Leu His Pro Phe Arg Ala Lys Leu Gln Ser Thr Arg
145 150 155 160

Arg Arg Ala Leu Arg Ile Leu Gly Ile Val Trp Gly Phe Ser Val Leu
165 170 175

Phe Ser Leu Pro Asn Thr Ser Ile His Gly Ile Lys Phe His Tyr Phe
180 185 190

Pro Asn Gly Ser Leu Val Pro Gly Ser Ala Thr Cys Thr Val Ile Lys
195 200 205

Pro Met Trp Ile Tyr Asn Phe Ile Ile Gln Val Thr Ser Phe Leu Phe
210 215 220

Tyr Leu Leu Pro Met Thr Val Ile Ser Val Leu Tyr Tyr Leu Met Ala
225 230 235 240

Leu Arg Leu Lys Lys Asp Lys Ser Leu Glu Ala Asp Glu Gly Asn Ala
245 250 255

Asn Ile Gln Arg Pro Cys Arg Lys Ser Val Asn Lys Met Leu Phe Val
260 265 270

Leu Val Leu Val Phe Ala Ile Cys Trp Ala Pro Phe His Ile Asp Arg
275 280 285

Leu Phe Phe Ser Phe Val Glu Glu Trp Ser Glu Ser Leu Ala Ala Val
290 295 300

Phe Asn Leu Val His Val Val Ser Gly Val Phe Phe Tyr Leu Ser Ser
305 310 315 320

Ala Val Asn Pro Ile Ile Tyr Asn Leu Leu Ser Arg Arg Phe Gln Ala
325 330 335

Ala Phe Gln Asn Val Ile Ser Ser Phe His Lys Gln Trp His Ser Gln
340 345 350

His Asp Pro Gln Leu Pro Pro Ala Gln Arg Asn Ile Phe Leu Thr Glu
355 360 365

Cys His Phe Val Glu Leu Thr Glu Asp Ile Gly Pro Gln Phe Pro Cys
370 375 380

Gln Ser Ser Met His Asn Ser His Leu Pro Thr Ala Leu Ser Ser Glu
385 390 395 400

Gln Met Ser Arg Thr Asn Tyr Gln Ser Phe His Phe Asn Lys Thr
405 410 415

(14) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1173 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
ATGCCAGATA CTAATAGCAC AATCAATTTA TCACTAAGCA CTCGTGTTAC TTTAGCATTT 60

TTTATGTCCT TAGTAGCTTT TGCTATAATG CTAGGAAATG CTTTGGTCAT TTTAGCTTTT 120

GTGGTGGACA AAAACCTTAG ACATCGAAGT AGTTATTTTT TTCTTAACTT GGCCATCTCT 180

GACTTCTTTG TGGGTGTGAT CTCCATTCCT TTGTACATCC CTCACACGCT GTTCGAATGG 240

GATTTTGGAA AGGAAATCTG TGTATTTTGG CTCACTACTG ACTATCTGTT ATGTACAGCA 300

TCTGTATATA ACATTGTCCT CATCAGCTAT GATCGATACC TGTCAGTCTC AAATGCTGTG 360

TCTTATAGAA CTCAACATAC TGGGGTCTTG AAGATTGTTA CTCTGATGGT GGCCGTTTGG 420

GTGCTGGCCT TCTTAGTGAA TGGGCCAATG ATTCTAGTTT CAGAGTCTTG GAAGGATGAA 480

GGTAGTGAAT GTGAACCTGG ATTTTTTTCG GAATGGTACA TCCTTGCCAT CACATCATTC 540

TTGGAATTCG TGATCCCAGT CATCTTAGTC GCTTATTTCA ACATGAATAT TTATTGGAGC 600

CTGTGGAAGC GTGATCATCT CAGTAGGTGC CAAAGCCATC CTGGACTGAC TGCTGTCTCT 660

TCCAACATCT GTGGACACTC ATTCAGAGGT AGACTATCTT CAAGGAGATC TCTTTCTGCA 720

TCGACAGAAG TTCCTGCATC CTTTCATTCA GAGAGACAGA GGAGAAAGAG TAGTCTCATG 780

TTTTCCTCAA GAACCAAGAT GAATAGCAAT ACAATTGCTT CCAAAATGGG TTCCTTCTCC 840

CAATCAGATT CTGTAGCTCT TCACCAAAGG GAACATGTTG AACTGCTTAG AGCCAGGAGA 900

TTAGCCAAGT CACTGGCCAT TCTCTTAGGG GTTTTTGCTG TTTGCTGGGC TCCATATTCT 960

CTGTTCACAA TTGTCCTTTC ATTTTATTGC TCAGCAAGAG GTCCTAAATC AGTTTGGTAT 1020

AGAATTGCAT TTTGGCTTCA GTGGTTCAAT TCCTTTGTCA ATCCTCTTTT GTATGCATTG 1080

TGTCACAAGC GCTTTCAAAA GGCTTTCTTG AAAATATTTT GTATAAAAAA GCAACCTCTA 1140

CCATCACAAC ACAGTCGGTC AGTATCTTCT TAA 1173
(15) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 390 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Pro Asp Thr Asn Ser Thr Ile Asn Leu Ser Leu Ser Thr Arg Val
1 5 10 15

Thr Leu Ala Phe Phe Met Ser Leu Val Ala Phe Ala Ile Met Leu Gly
20 25 30

Asn Ala Leu Val Ile Leu Ala Phe Val Val Asp Lys Asn Leu Arg His
35 40 45

Arg Ser Ser Tyr Phe Phe Leu Asn Leu Ala Ile Ser Asp Phe Phe Val
50 55 60

Gly Val Ile Ser Ile Pro Leu Tyr Ile Pro His Thr Leu Phe Glu Trp
65 70 75 80

Asp Phe Gly Lys Glu Ile Cys Val Phe Trp Leu Thr Thr Asp Tyr Leu
85 90 95

Leu Cys Thr Ala Ser Val Tyr Asn Ile Val Leu Ile Ser Tyr Asp Arg
100 105 110

Tyr Leu Ser Val Ser Asn Ala Val Ser Tyr Arg Thr Gln His Thr Gly
115 120 125

Val Leu Lys Ile Val Thr Leu Met Val Ala Val Trp Val Leu Ala Phe
130 135 140

Leu Val Asn Gly Pro Met Ile Leu Val Ser Glu Ser Trp Lys Asp Glu
145 150 155 160

Gly Ser Glu Cys Glu Pro Gly Phe Phe Ser Glu Trp Tyr Ile Leu Ala
165 170 175

Ile Thr Ser Phe Leu Glu Phe Val Ile Pro Val Ile Leu Val Ala Tyr
180 185 190

Phe Asn Met Asn Ile Tyr Trp Ser Leu Trp Lys Arg Asp His Leu Ser
195 200 205

Arg Cys Gln Ser His Pro Gly Leu Thr Ala Val Ser Ser Asn Ile Cys
210 215 220

Gly His Ser Phe Arg Gly Arg Leu Ser Ser Arg Arg Ser Leu Ser Ala
225 230 235 240

Ser Thr Glu Val Pro Ala Ser Phe His Ser Glu Arg Gln Arg Arg Lys
245 250 255

Ser Ser Leu Met Phe Ser Ser Arg Thr Lys Met Asn Ser Asn Thr Ile
260 265 270

Ala Ser Lys Met Gly Ser Phe Ser Gln Ser Asp Ser Val Ala Leu His
275 280 285

Gln Arg Glu His Val Glu Leu Leu Arg Ala Arg Arg Leu Ala Lys Ser
290 295 300

Leu Ala Ile Leu Leu Gly Val Phe Ala Val Cys Trp Ala Pro Tyr Ser
305 310 315 320

Leu Phe Thr Ile Val Leu Ser Phe Tyr Ser Ser Ala Thr Gly Pro Lys
325 330 335

Ser Val Trp Tyr Arg Ile Ala Phe Trp Leu Gln Trp Phe Asn Ser Phe
340 345 350

Val Asn Pro Leu Leu Tyr Pro Leu Cys His Lys Arg Phe Gln Lys Ala
355 360 365

Phe Leu Lys Ile Phe Cys Ile Lys Lys Gln Pro Leu Pro Ser Gln His
370 375 380

Ser Arg Ser Val Ser Ser
385 390

(16) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

GGAAAGCTTA ACGATCCCCA GGAGCAACAT 30

(17) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iv) ANTI-SENSE: YES 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

CTGGGATCCT ACGAGAGCAT TTTTCACACA G 31

(18) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

ATGGCGAACG CGAGCGAGCC GGGTGGCAGC GGCGGCGGCG AGGCGGCCGC CCTGGGCCTC 60

AAGCTGGCCA CGCTCAGCCT GCTGCTGTGC GTGAGCCTAG CGGGCAACGT GCTGTTCGCG 120

CTGCTGATCG TGCGGGAGCG CAGCCTGCAC CGCGCCCCGT ACTACCTGCT GCTCGACCTG 180

TGCCTGGCCG ACGGGCTGCG CGCGCTCGCC TGCCTCCCGG CCGTCATGCT GGCGGCGCGG 240

CGTGCGGCGG CCGCGGCGGG GGCGCCGCCG GGCGCGCTGG GCTGCAAGCT GCTCGCCTTC 300

CTGGCCGCGC TCTTCTGCTT CCACGCCGCC TTCCTGCTGC TGGGCGTGGG CGTCACCCGC 360

TACCTGGCCA TCGCGCACCA CCGCTTCTAT GCAGAGCGCC TGGCCGGCTG GCCGTGCGCC 420

GCCATGCTGG TGTGCGCCGC CTGGGCGCTG GCGCTGGCCG CGGCCTTCCC GCCAGTGCTG 480

GACGGCGGTG GCGACGACGA GGACGCGCCG TGCGCCCTGG AGCAGCGGCC CGACGGCGCC 540

CCCGGCGCGC TGGGCTTCCT GCTGCTGCTG GCCGTGGTGG TGGGCGCCAC GCACCTCGTC 600

TACCTCCGCC TGCTCTTCTT CATCCACGAC CGCCGCAAGA TGCGGCCCGC GCGCCTGGTG 660

CGCGCCGTCA GCCACGACTG GAGCTTCCAC GGCCCGGGCG CCACCGGCCA GGCGGCCGCC 720

AACTGG^CGG CGGGCTTCGG CCGCGGGCCC ACGCCGCCCG CGCTTGTGGG CATCCGGCCC 780

GCAGGGCCGG GCCGCGGCGC GCGCCGCCTC CTCGTGCTGG AAGAATTCAA GACGGAGAAG 840

AGGCTGTGCA ^^^^CTA CGCCGTCACG GTGCTCCTCC TGCTCCTCTG GGGGCCCTAC 900

GTCGTGGCCA GCTACCTGCG GGTCGTGGTG CGGCCCGGCG CCGTCCCCCA GGCCTACCTG 960

ACGGCCTCCG TGTGGCTGAC CTTCGCGCAG GCCGGCATCA ACCCCGTCGT GTGCTTCCTC 1020

TTCAACAGGG AGCTGAGGGA CTGCTTCAGG GCCGAGTTCC CCTGCTGCCA GAGCCCCCGG 1080

ACCACCCAGG CGACCCATCC CTGCGACCTG AAAGGCATTG GTTTATGA 1128
(19) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 375 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Met Ala Asn Ala Ser Glu Pro Gly Gly Ser Gly Gly Gly Glu Ala Ala
1 5 10 15

Ala Leu Gly Leu Lys Leu Ala Thr Leu Ser Leu Leu Leu Cys Val Ser
20 25 30

Leu Ala Gly Asn Val Leu Phe Ala Leu Leu Ile Val Arg Glu Arg Ser
35 40 45

Leu His Arg Ala Pro Tyr Tyr Leu Leu Leu Asp Leu Cys Leu Ala Asp
50 55 60

Gly Leu Arg Ala Leu Ala Cys Leu Pro Ala Val Met Leu Ala Ala Arg
65 70 75 80

Arg Ala Ala Ala Ala Ala Gly Ala Pro Pro Gly Ala Leu Gly Cys Lys
85 90 95

Leu Leu Ala Phe Leu Ala Ala Leu Phe Cys Phe His Ala Ala Phe Leu
100 105 110

Leu Leu Gly Val Gly Val Thr Arg Tyr Leu Ala Ile Ala His His Arg
115 120 125

Phe Tyr Ala Glu Arg Leu Ala Gly Trp Pro Cys Ala Ala Met Leu Val
130 135 140

Cys Ala Ala Trp Ala Leu Ala Leu Ala Ala Ala Phe Pro Pro Val Leu
145 150 155 160

Asp Gly Gly Gly Asp Asp Glu Asp Ala Pro Cys Ala Leu Glu Gln Arg
165 170 175

Pro Asp Gly Ala Pro Gly Ala Leu Gly Phe Leu Leu Leu Leu Ala Val
180 185 190

Val Val Gly Ala Thr His Leu Val Tyr Leu Arg Leu Leu Phe Phe Ile
195 200 205

His Asp Arg Arg Lys Met Arg Pro Ala Arg Leu Val Pro Ala Val Ser
210 215 220

His Asp Trp Thr Phe His Gly Pro Gly Ala Thr Gly Gln Ala Ala Ala
225 230 235 240

Asn Trp Thr Ala Gly Phe Gly Arg Gly Pro Thr Pro Pro Ala Leu Val
245 250 255

Gly Ile Arg Pro Ala Gly Pro Gly Arg Gly Ala Arg Arg Leu Leu Val
260 265 270

Leu Glu Glu Phe Lys Thr Glu Lys Arg Leu Cys Lys Met Phe Tyr Ala
275 280 285

Val Thr Leu Leu Phe Leu Leu Leu Trp Gly Pro Tyr Val Val Ala Ser
290 295 300

Tyr Leu Arg Val Leu Val Arg Pro Gly Ala Val Pro Gln Ala Tyr Leu
305 310 315 320

Thr Ala Ser Val Trp Leu Thr Phe Ala Gln Ala Gly Ile Asn Pro Val
325 330 335

Val Cys Phe Leu Phe Asn Arg Glu Leu Arg Asp Cys Phe Arg Ala Gln
340 345 350

Phe Pro Cys Cys Gln Ser Pro Arg Thr Thr Gln Ala Thr His Pro Cys
355 360 365

Asp Leu Lys Gly Ile Gly Leu
370 375

(20) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1002 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

ATGAACACCA CAGTGATGCA AGGCTTCAAC AGATCTGAGC GGTGCCCCAG AGACACTCGG 60

ATAGTACAGC TGGTATTCCC AGCCCTCTAC ACAGTGGTTT TCTTGACCGG CATCCTGCTG 120

AATACTTTGG CTCTGTGGGT GTTTGTTCAC ATCCCCAGCT CCTCCACCTT CATCATCTAC 180

CTCAAAAACA CTTTGGTGGC CGACTTGATA ATGACACTCA TGCTTCCTTT CAAAATCCTC 240

TCTGACTCAC ACCTGGCACC CTGGCAGCTC AGAGCTTTTG TGTGTCGTTT TTCTTCGGTG 300

ATATTTTATG AGACCATGTA TGTGGGCATC GTGCTGTTAG GGCTCATAGC CTTTGACAGA 360

TTCCTCAAGA TCATCAGACC TTTGAGAAAT ATTTTTCTAA AAAAACCTGT TTTTGCAAAA 420

ACGGTCTCAA TCTTCATCTG GTTCTTTTTG TTCTTCATCT CCCTGCCAAA TACGATCTTG 480

AGCAACAAGG AAGCAACACC ATCGTCTGTG AAAAAGTGTG CTTCCTTAAA GGGGCCTCTG 540

GGGCTGAAAT GGCATCAAAT GGTAAATAAC ATATGCCAGT TTATTTTCTG GACTGTTTTT 600

ATCCTAATGC TTGTGTTTTA TGTGGTTATT GCAAAAAAAG TATATGATTC TTATAGAAAG 660

TCCAAAAGTA AGGACAGAAA AAACAACAAA AAGCTGGAAG GCAAAGTATT TGTTGTCGTG 720

GCTGTCTTCT TTGTGTGTTT TGCTCCATTT CATTTTGCCA GAGTTCCATA TACTCACAGT 780

CAAACCAACA ATAAGACTGA CTGTAGACTG CAAAATCAAC TGTTTATTGC TAAAGAAACA 840

ACTCTCTTTT TGGCAGCAAC TAACATTTGT ATGGATCCCT TAATATACAT ATTCTTATGT 900

AAAAAATTCA CAGAAAAGCT ACCATGTATG CAAGGGAGAA AGACCACAGC ATCAAGCCAA 960

GAAAATCATA GCAGTCAGAC AGACAACATA ACCTTAGGCT GA 1002

(21) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 333 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Met Asn Thr Thr Val Met Gln Gly Phe Asn Arg Ser Glu Arg Cys Pro
1 5 10 15

Arg Asp Thr Arg Ile Val Gln Leu Val Phe Pro Ala Leu Tyr Thr Val
20 25 30

Val Phe Leu Thr Gly Ile Leu Leu Asn Thr Leu Ala Leu Trp Val Phe
35 40 45

Val His Ile Pro Ser Ser Ser Thr Phe Ile Ile Tyr Leu Lys Asn Thr
50 55 60

Leu Val Ala Asp Leu Ile Met Thr Leu Met Leu Pro Phe Lys Ile Leu
65 70 75 80

Ser Asp Ser His Leu Ala Pro Trp Gln Leu Arg Ala Phe Val Cys Arg
85 90 95

Phe Ser Ser Val Ile Phe Tyr Glu Thr Met Tyr Val Gly Ile Val Leu
100 105 110

Leu Gly Leu Ile Ala Phe Asp Arg Phe Leu Lys Ile Ile Arg Pro Leu
115 120 125

Arg Asn Ile Phe Leu Lys Lys Pro Val Phe Ala Lys Thr Val Ser Ile
130 135 140

Phe Ile Trp Phe Phe Leu Phe Phe Ile Ser Leu Pro Asn Thr Ile Leu
145 150 155 160

Ser Asn Lys Glu Ala Thr Pro Ser Ser Val Lys Lys Cys Ala Ser Leu
165 170 175

Lys Gly Pro Leu Gly Leu Lys Trp His Gln Met Val Asn Asn Ile Cys
180 185 190

Gln Phe Ile Phe Trp Thr Val Phe Ile Leu Met Leu Val Phe Tyr Val
195 200 205

Val Ile Ala Lys Lys Val Tyr Asp Ser Tyr Arg Lys Ser Lys Ser Lys
210 215 220

Asp Arg Lys Asn Asn Lys Lys Leu Glu Gly Lys Val Phe Val Val Val
225 230 235 240

Ala Val Phe Phe Val Cys Phe Ala Pro Phe His Phe Ala Arg Val Pro
245 250 255

Tyr Thr His Ser Gln Thr Asn Asn Lys Thr Asp Cys Arg Leu Gln Asn
260 265 270

Gln Leu Phe Ile Ala Lys Glu Thr Thr Leu Phe Leu Ala Ala Thr Asn
275 280 285

Ile Cys Met Asp Pro Leu Ile Tyr Ile Phe Leu Cys Lys Lys Phe Thr
290 295 300

Glu Lys Leu Pro Cys Met Gln Gly Arg Lys Thr Thr Ala Ser Ser Gln
305 310 315 320

Glu Asn His Ser Ser Gln Thr Asp Asn Ile Thr Leu Gly

(22) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1122 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:




ATGGCCAACA CTACCGGAGA GCCTGAGGAG GTGAGCGGCG CTCTGTCCCC ACCGTCCGCA 60

TCAGCTTATG TGAAGCTGGT ACTGCTGGGA CTGATTATGT GCGTGAGCCT GGCGGGTAAC 120

GCCATCTTGT CCCTGCTGGT GCTCAAGGAG CGTGCCCTGC ACAAGGCTCC TTACTACTTC 180

CTGCTGGACC TGTGCCTGGC CGATGGCATA CGCTCTGCCG TCTGCTTCCC CTTTGTGCTG 240

GCTTCTGTGC GCCACGGCTC TTCATGGACC TTCAGTGCAC TCAGCTGCAA GATTGTGGCC 300

TTTATGGCCG TGCTCTTTTG CTTCCATGCG GCCTTCATGC TGTTCTGCAT CAGCGTCACC 360

CGCTACATGG CCATCGCCCA CCACCGCTTC TACGCCAAGC GCATGACACT CTGGACATGC 420

GCGGCTGTCA TCTGCATGGC CTGGACCCTG TCTGTGGCCA TGGCCTTCCC ACCTGTCTTT 480

GACGTGGGCA CCTACAAGTT TATTCGGGAG GAGGACCAGT GCATCTTTGA GCATCGCTAC 540

TTCAAGGCCA ATGACACGCT GGGCTTCATG CTTATGTTGG CTGTGCTCAT GGCAGCTACC 600

CATGCTGTCT ACGGCAAGCT GCTCCTCTTC GAGTATCGTC ACCGCAAGAT GAAGCCAGTG 660

CAGATGGTGC CAGCCATCAG CCAGAACTGG ACATTCCATG GTCCCGGGGC CACCGGCCAG 720

GCTGCTGCCA ACTGGATCGC CGGCTTTGGC CGTGGGCCCA TGCCACCAAC CCTGCTGGGT 780

ATCCGGCAGA ATGGGCATGC AGCCAGCCGG CGGCTACTGG GCATGGACGA GGTCAAGGGT 840

GAAAAGCAGC TGGGCCGCAT GTTCTACGCG ATCACACTGC TCTTTCTGCT CCTCTGGTCA 900

CCCTACATCG TGGCCTGCTA CTGGCGAGTG TTTGTGAAAG CCTGTGCTGT GCCCCACCGC 960

TACCTGGCCA CTGCTGTTTG GATGAGCTTC GCCCAGGCTG CCGTCAACCC AATTGTCTGC 1020

TTCCTGCTCA ACAAGGACCT CAAGAAGTGC CTGACCACTC ACGCCCCCTG CTGGGGCACA 1080

GGAGGTGCCC CGGCTCCCAG AGAACCCTAC TGTGTCATGT GA 1122




(23) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

Met Ala Asn Thr Thr Gly Glu Pro Glu Glu Val Ser Gly Ala Leu Ser
1 5 10 15

Pro Pro Ser Ala Ser Ala Tyr Val Lys Leu Val Leu Leu Gly Leu Ile
20 25 30

Met Cys Val Ser Leu Ala Gly Asn Ala Ile Leu Ser Leu Leu Val Leu
35 40 45

Lys Glu Arg Ala Leu His Lys Ala Pro Tyr Tyr Phe Leu Leu Asp Leu
50 55 60

Cys Leu Ala Asp Gly Ile Arg Ser Ala Val Cys Phe Pro Phe Val Leu
65 70 75 80

Ala Ser Val Arg His Gly Ser Ser Trp Thr Phe Ser Ala Leu Ser Cys
85 90 95

Lys Ile Val Ala Phe Met Ala Val Leu Phe Cys Phe His Ala Ala Phe
100 105 110

Met Leu Phe Cys Ile Ser Val Thr Arg Tyr Met Ala Ile Ala His His
115 120 125

Arg Phe Tyr Ala Lys Arg Met Thr Leu Trp Thr Cys Ala Ala Val Ile
130 135 140

Cys Met Ala Trp Thr Leu Ser Val Ala Met Ala Phe Pro Pro Val Phe
145 150 155 160

Asp Val Gly Thr Tyr Lys Phe Ile Arg Glu Glu Asp Gln Cys Ile Phe
165 170 175

Glu His Arg Tyr Phe Lys Ala Asn Asp Thr Leu Gly Phe Met Leu Met
180 185 190

Leu Ala Val Leu Met Ala Ala Thr His Ala Val Tyr Gly Lys Leu Leu
195 200 205

Leu Phe Glu Tyr Arg His Arg Lys Met Lys Pro Val Gln Met Val Pro
210 215 220

Ala Ile Ser Gln Asn Trp Thr Phe His Gly Pro Gly Ala Thr Gly Gln
225 230 235 240

Ala Ala Ala Asn Trp Ile Ala Gly Phe Gly Arg Gly Pro Met Pro Pro
245 250 255

Thr Leu Leu Gly Ile Arg Gln Asn Gly His Ala Ala Ser Arg Arg Leu
260 265 270

Leu Gly Met Asp Glu Val Lys Gly Glu Lys Gln Leu Gly Arg Met Phe
275 280 285

Tyr Ala Ile Thr Leu Leu Phe Leu Leu Leu Trp Ser Pro Tyr Ile Val
290 295 300

Ala Cys Tyr Trp Arg Val Phe Val Lys Ala Cys Ala Val Pro His Arg
305 310 315 320

Tyr Leu Ala Thr Ala Val Trp Met Ser Phe Ala Gln Ala Ala Val Asn
325 330 335

Pro Ile Val Cys Phe Leu Leu Asn Lys Asp Leu Lys Lys Cys Leu Thr
340 345 350

Thr His Ala Pro Cys Trp Gly Thr Gly Gly Ala Pro Ala Pro Arg Glu
355 360 365

Pro Tyr Cys Val Met
370

(24) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1053 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

ATGGCTTTGG AACAGAACCA GTCAACAGAT TATTATTATG AGGAAAATGA AATGAATGGC 60

ACTTATGACT ACAGTCAATA TGAATTGATC TGTATCAAAG AAGATGTCAG AGAATTTGCA 120

AAAGTTTTCC TCCCTGTATT CCTCACAATA GCTTTCGTCA TTGGACTTGC AGGCAATTCC 180

ATGGTAGTGG CAATTTATGC CTATTACAAG AAACAGAGAA CCAAAACAGA TGTGTACATC 240

CTGAATTTGG CTGTAGCAGA TTTACTCCTT CTATTCACTC TGCCTTTTTG GGCTGTTAAT 300

GCAGTTCATG GGTGGGTTTT AGGGAAAATA ATGTGCAAAA TAACTTCAGC CTTGTACACA 360

CTAAACTTTG TCTCTGGAAT GCAGTTTCTG GCTTGCATCA GCATAGACAG ATATGTGGCA 420

GTAACTAATG TCCCCAGCCA ATCAGGAGTG GGAAAACCAT GCTGGATCAT CTGTTTCTGT 480




GTCTGGATGG CTGCCATCTT GCTGAGCATA CCCCAGCTGG TTTTTTATAC AGTAAATGAC 540 

AATGCTAGGT GCATTCCCAT TTTCCCCCGC TACCTAGGAA CATCAATGAA AGCATTGATT 600

CAAATGCTAG AGATCTGCAT TGGATTTGTA GTACCCTTTC TTATTATGGG GGTGTGCTAC 660 

TTTATGACGG CAAGGACACT CATGAAGATG CCAAACATTA AAATATCTCG ACCCCTAAAA 720 

GTTCTGCTCA CAGTCGTTAT AGTTTTCATT GTCACTCAAG TGCCTTATAA CATTGTCAAG 780 

TTCTGCCGAG CCATAGACAT CATCTACTCC CTGATCACCA GCTGCAACAT GAGCAAACGC 840 

ATGGACATCG CCATCCAAGT CACAGAAAGC ATTGCACTCT TTCACAGCTG CCTCAACCCA 900

ATCCTTTATG TTTTTATGGG AGCATCTTTC AAAAACTACG TTATGAAAGT GGCCAAGAAA 960

TATGGGTCCT GGAGAAGACA GAGACAAAGT GTGGAGGAGT TTCCTTTTGA TTCTGAGGGT 1020 

CCTACAGAGC CAACCAGTAC TTTTAGCATT TAA 1053 
(25) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 350 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:

Met Ala Leu Glu Gln Asn Gln Ser Thr Asp Tyr Tyr Tyr Glu Glu Asn
1 5 10 15

Glu Met Asn Gly Thr Tyr Asp Tyr Ser Gln Tyr Glu Leu Ile Cys Ile
20 25 30

Lys Glu Asp Val Arg Glu Phe Ala Lys Val Phe Leu Pro Val Phe Leu
35 40 45

Thr Ile Ala Phe Val Ile Gly Leu Ala Gly Asn Ser Met Val Val Ala
50 55 60

Ile Tyr Ala Tyr Tyr Lys Lys Gln Arg Thr Lys Thr Asp Val Tyr Ile
65 70 75 80

Leu Asn Leu Ala Val Ala Asp Leu Leu Leu Leu Phe Thr Leu Pro Phe
85 90 95

Trp Ala Val Asn Ala Val His Gly Trp Val Leu Gly Lys Ile Met Cys
100 105 110

Lys Ile Thr Ser Ala Leu Tyr Thr Leu Asn Phe Val Ser Gly Met Gln
115 120 125

Phe Leu Ala Cys Ile Ser Ile Asp Arg Tyr Val Ala Val Thr Asn Val
130 135 140

Pro Ser Gln Ser Gly Val Gly Lys Pro Cys Trp Ile Ile Cys Phe Cys
145 150 155 160

Val Trp Met Ala Ala Ile Leu Leu Ser Ile Pro Gln Leu Val Phe Tyr
165 170 175

Thr Val Asn Asp Asn Ala Arg Cys Ile Pro Ile Phe Pro Arg Tyr Leu
180 185 190

Gly Thr Ser Met Lys Ala Leu Ile Gln Met Leu Glu Ile Cys Ile Gly
195 200 205

Phe Val Val Pro Phe Leu Ile Met Gly Val Cys Tyr Phe Ile Thr Ala
210 215 220

Arg Thr Leu Met Lys Met Pro Asn Ile Lys Ile Ser Arg Pro Leu Lys
225 230 235 240

Val Leu Leu Thr Val Val Ile Val Phe Ile Val Thr Gln Leu Pro Tyr
245 250 255

Asn Ile Val Lys Phe Cys Arg Ala Ile Asp Ile Ile Tyr Ser Leu Ile
260 265 270

Thr Ser Cys Asn Met Ser Lys Arg Met Asp Ile Ala Ile Gln Val Thr
275 280 285

Glu Ser Ile Ala Leu Phe His Ser Cys Leu Asn Pro Ile Leu Tyr Val
290 295 300

Phe Met Gly Ala Ser Phe Lys Asn Tyr Val Met Lys Val Ala Lys Lys
305 310 315 320

Tyr Gly Ser Trp Arg Arg Gln Arg Gln Ser Val Glu Glu Phe Pro Phe
325 330 335

Asp Ser Glu Gly Pro Thr Glu Pro Thr Ser Thr Phe Ser Ile
340 345 350

(26) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1116 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

ATGCCAGGAA ACGCCACCCC AGTGACCACC ACTGCCCCGT GGGCCTCCCT GGGCCTCTCC 60

GCCAAGACCT GCAACAACGT GTCCTTCGAA GAGAGCAGGA TAGTCCTGGT CGTGGTGTAC 120 

AGCGCGGTGT GCACGCTGGG GGTGCCGGCC AACTGCCTGA CTGCGTGGCT GGCGCTGCTG 180 

CAGGTACTGC AGGGCAACGT GCTGGCCGTC TACCTGCTCT GCCTGGCACT CTGCGAACTG 240 

CTGTACACAG GCACGCTGCC ACTCTGGGTC ATCTATATCC GCAACCAGCA CCGCTGGACC 300 

CTAGGCCTGC TGGCCTCGAA GGTGACCGCC TACATCTTCT TCTGCAACAT CTACGTCAGC 360 

ATCCTCTTCC TGTGCTGCAT CTCCTGCGAC CGCTTCGTGG CCGTGGTGTA CGCGCTGGAG 420 

AGTCGGGGCC GCCGCCGCCG GAGGACCGCC ATCCTCATCT CCGCCTGCAT CTTCATCCTC 480 

GTCGGGATCG TTCACTACCC GGTGTTCCAG ACGGAAGACA AGGAGACCTG CTTTGACATG 540 

CTGCAGATGG ACAGCAGGAT TGCCGGGTAC TACTACGCCA GGTTCACCGT TGGCTTTGCC 600 

ATCCCTCTCT CCATCATCGC CTTCACCAAC CACCGGATTT TCAGGAGCAT CAAGCAGAGC 660 

ATGGGCTTAA GCGCTGCCCA GAAGGCCAAG GTGAAGCACT CGGCCATCGC GGTGGTTGTC 720 

ATCTTCCTAG TCTGCTTCGC CCCGTACCAC CTGGTTCTCC TCGTCAAAGC CGCTGCCTTT 780 

TCCTACTACA GAGGAGACAG GAACGCCATG TGCGGCTTGG AGGAAAGGCT GTACACAGCC 840 

TCTGTGGTGT TTCTGTGCCT GTCCACGGTG AACGGCGTGG CTGACCCCAT TATCTACGTG 900 

CTGGCCACGG ACCATTCCCG CCAAGAAGTG TCCAGAATCC ATAAGGGGTG GAAAGAGTGG 960 

TCCATGAAGA CAGACGTCAC CAGGCTCACC CACAGCAGGG ACACCGAGGA GCTGCAGTCG 1020 

CCCGTGGCCC TTGCAGACCA CTACACCTTC TCCAGGCCCG TGCACCCACC AGGGTCACCA 1080 

TGCCCTGCAA AGAGGCTGAT TGAGGAGTCC TGCTGA 1116 
(27) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 371 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

Met Pro Gly Asn Ala Thr Pro Val Thr Thr Thr Ala Pro Trp Ala Ser
1 5 10 15

Leu Gly Leu Ser Ala Lys Thr Cys Asn Asn Val Ser Phe Glu Glu Ser
20 25 30

Arg Ile Val Leu Val Val Val Tyr Ser Ala Val Cys Thr Leu Gly Val
35 40 45

Pro Ala Asn Cys Leu Thr Ala Trp Leu Ala Leu Leu Gln Val Leu Gln
50 55 60

Gly Asn Val Leu Ala Val Tyr Leu Leu Cys Leu Ala Leu Cys Glu Leu
65 70 75 80

Leu Tyr Thr Gly Thr Leu Pro Leu Trp Val Ile Tyr Ile Arg Asn Gln
85 90 95

His Arg Trp Thr Leu Gly Leu Leu Ala Ser Lys Val Thr Ala Tyr Ile
100 105 110

Phe Phe Cys Asn Ile Tyr Val Ser Ile Leu Phe Leu Cys Cys Ile Ser
115 120 125

Cys Asp Arg Phe Val Ala Val Val Tyr Ala Leu Glu Ser Arg Gly Arg
130 135 140

Arg Arg Arg Arg Thr Ala Ile Leu Ile Ser Ala Cys Ile Phe Ile Leu
145 150 155 160

Val Gly Ile Val His Tyr Pro Val Phe Gln Thr Glu Asp Lys Glu Thr
165 170 175

Cys Phe Asp Met Leu Gln Met Asp Ser Arg Ile Ala Gly Tyr Tyr Tyr
180 185 190

Ala Arg Phe Thr Val Gly Phe Ala Ile Pro Leu Ser Ile Ile Ala Phe
195 200 205

Thr Asn His Arg Ile Phe Arg Ser Ile Lys Gln Ser Met Gly Leu Ser
210 215 220

Ala Ala Gln Lys Ala Lys Val Lys His Ser Ala Ile Ala Val Val Val
225 230 235 240

Ile Phe Leu Val Cys Phe Ala Pro Tyr His Leu Val Leu Leu Val Lys
245 250 255

Ala Ala Ala Phe Ser Tyr Tyr Arg Gly Asp Arg Asn Ala Met Cys Gly
260 265 270

Leu Glu Glu Arg Leu Tyr Thr Ala Ser Val Val Phe Leu Cys Leu Ser
275 280 285

Thr Val Asn Gly Val Ala Asp Pro Ile Ile Tyr Val Leu Ala Thr Asp
290 295 300

His Ser Arg Gln Glu Val Ser Arg Ile His Lys Gly Trp Lys Glu Trp
305 310 315 320

Ser Met Lys Thr Asp Val Thr Arg Leu Thr His Ser Arg Asp Thr Glu
325 330 335

Glu Leu Gln Ser Pro Val Ala Leu Ala Asp His Tyr Thr Phe Ser Arg
340 345 350

Pro Val His Pro Pro Gly Ser Pro Cys Pro Ala Lys Arg Leu Ile Glu
355 360 365

Glu Ser Cys
370

(28) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1113 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
ATGGCGAACT ATAGCCATGC AGCTGACAAC ATTTTGCAAA ATCTCTCGCC TCTAACAGCC 60

TTTCTGAAAC TGACTTCCTT GGGTTTCATA ATAGGAGTCA GCGTGGTGGG CAACCTCCTG 120

ATCTCCATTT TGCTAGTGAA AGATAAGACC TTGCATAGAG CACCTTACTA CTTCCTGTTG 180

GATCTTTGCT GTTCAGATAT CCTCAGATCT GCAATTTGTT TCCCATTTGT GTTCAACTCT 240

GTCAAAAATG GCTCTACCTG GACTTATGGG ACTCTGACTT GCAAAGTGAT TGCCTTTCTG 300

GGGGTTTTGT CCTGTTTCCA CACTGCTTTC ATGCTCTTCT GCATCAGTGT CACCAGATAC 360

TTAGCTATCG CCCATCACCG CTTCTATACA AAGAGGCTGA CCTTTTGGAC GTGTCTGGCT 420

GTGATCTGTA TGGTGTGGAC TCTGTCTGTG GCCATGGCAT TTCCCCCGGT TTTAGACGTG 480

GGCACTTACT CATTCATTAG GGAGGAAGAT CAATGCACCT TCCAACACCG CTCCTTCAGG 540

GCTAATGATT CCTTAGGATT TATGCTGCTT CTTGCTCTCA TCCTCCTAGC CACACAGCTT 600

GTCTACCTCA AGCTGATATT TTTCGTCCAC GATCGAAGAA AAATGAAGCC AGTCCAGTTT 660

GTAGCAGCAG TCAGCCAGAA CTGGACTTTT CATGGTCCTG GAGCCAGTGG CCAGGCAGCT 720

GCCAATTGGC TAGCAGGATT TGGAAGGGGT CCCACACCAC CCACCTTGCT GGGCATCAGG 780

CAAAATGCAA ACACCACAGG CAGAAGAAGG CTATTGGTCT TAGACGAGTT CAAAATGGAG 840

AAAAGAATCA GCAGAATGTT CTATATAATG ACTTTTCTGT TTCTAACCTT GTGGGGCCCC 900

TACCTGGTGG CCTGTTATTG GAGAGTTTTT GCAAGAGGGC CTGTAGTACC AGGGGGATTT 960

CTAACAGCTG CTC[illegible] 1020

TTCTCAAACA GGGAGCTGAG GCGCTGTTTC AGCACAACCC TTCTTTACTG CAGAAAATCC 1080

AGGTTACCAA GGGAACCTTA CTGTGTTATA TGA 1113

(29) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 370 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

Met Ala Asn Tyr Ser His Ala Ala Asp Asn Ile Leu Gln Asn Leu Ser
1 5 10 15

Pro Leu Thr Ala Phe Leu Lys Leu Thr Ser Leu Gly Phe Ile Ile Gly
20 25 30

Val Ser Val Val Gly Asn Leu Leu Ile Ser Ile Leu Leu Val Lys Asp
35 40 45

Lys Thr Leu His Arg Ala Pro Tyr Tyr Phe Leu Leu Asp Leu Cys Cys
50 55 60

Ser Asp Ile Leu Arg Ser Ala Ile Cys Phe Pro Phe Val Phe Asn Ser
65 70 75 80

Val Lys Asn Gly Ser Thr Trp Thr Tyr Gly Thr Leu Thr Cys Lys Val
85 90 95

Ile Ala Phe Leu Gly Val Leu Ser Cys Phe His Thr Ala Phe Met Leu
100 105 110

Phe Cys Ile Ser Val Thr Arg Tyr Leu Ala Ile Ala His His Arg Phe
115 120 125

Tyr Thr Lys Arg Leu Thr Phe Trp Thr Cys Leu Ala Val Ile Cys Met
130 135 140

Val Trp Thr Leu Ser Val Ala Met Ala Phe Pro Pro Val Leu Asp Val
145 150 155 160

Gly Thr Tyr Ser Phe Ile Arg Glu Glu Asp Gln Cys Thr Phe Gln His
165 170 175

Arg Ser Phe Arg Ala Asn Asp Ser Leu Gly Phe Met Leu Leu Leu Ala
180 185 190

Leu Ile Leu Leu Ala Thr Gln Leu Val Tyr Leu Lys Leu Ile Phe Phe
195 200 205

Val His Asp Arg Arg Lys Met Lys Pro Val Gln Phe Val Ala Ala Val
210 215 220

Ser Gln Asn Trp Thr Phe His Gly Pro Gly Ala Ser Gly Gln Ala Ala
225 230 235 240

Ala Asn Trp Leu Ala Gly Phe Gly Arg Gly Pro Thr Pro Pro Thr Leu
245 250 255

Leu Gly Ile Arg Gln Asn Ala Asn Thr Thr Gly Arg Arg Arg Leu Leu
260 265 270

Val Leu Asp Glu Phe Lys Met Glu Lys Arg Ile Ser Arg Met Phe Tyr
275 280 285

Ile Met Thr Phe Leu Phe Leu Thr Leu Trp Gly Pro Tyr Leu Val Ala
290 295 300

Cys Tyr Trp Arg Val Phe Ala Arg Gly Pro Val Val Pro Gly Gly Phe
305 310 315 320

Leu Thr Ala Ala Val Trp Met Ser Phe Ala Gln Ala Gly Ile Asn Pro
325 330 335

Phe Val Cys Ile Phe Ser Asn Arg Glu Leu Arg Arg Cys Phe Ser Thr
340 345 350

Thr Leu Leu Tyr Cys Arg Lys Ser Arg Leu Pro Arg Glu Pro Tyr Cys
355 360 365

Val Ile
370

(30) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1080 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:



ATGCAGGTCC CGAACAGCAC CGGCCCGGAC AACGCGACGC TGCAGATGCT GCGGAACCCG 60

GCGATCGCGG TGGCCCTGCC CGTGGTGTAC TCGCTGGTGG CGGCGGTCAG CATCCCGGGC 120

AACCTCTTCT CTCTGTGGGT GCTGTGCCGG CGCATGGGGC CCAGATCCCC GTCGGTCATC 180

TTCATGATCA ACCTGAGCGT CACGGACCTG ATGCTGGCCA GCGTGTTGCC TTTCCAAATC 240

TACTACCATT GCAACCGCCA CCACTGGGTA TTCGGGGTGC TGCTTTGCAA CGTGGTGACC 300

GTGGCCTTTT ACGCAAACAT GTATTCCAGC ATCCTCACCA TGACCTGTAT CAGCGTGGAG 360

CGCTTCCTGG GGGTCCTGTA CCCGCTCAGC TCCAAGCGCT GGCGCCGCCG TCGTTACGCG 420

GTGGCCGCGT GTGCAGGGAC CTGGCTGCTG CTCCTGACCG CCCTGTGCCC GCTGGCGCGC 480

ACCGATCTCA CCTACCCGGT GCACGCCCTG GGCATCATCA CCTGCTTCGA CGTCCTCAAG 540

TGGACGATGC TCCCCAGCGT GGCCATGTGG GCCGTGTTCC TCTTCACCAT CTTCATCCTG 600

CTGTTCCTCA TCCCGTTCGT GATCACCGTG GCTTGTTACA CGGCCACCAT CCTCAAGCTG 660

TTGCGCACGG AGGAGGCGCA CGGCCGGGAG CAGCGGAGGC GCGCGGTGGG CCTGGCCGCG 720

GTGGTCTTGC TGGCCTTTGT CACCTGCTTC GCCCCCAACA ACTTCGTGCT CCTGGCGCAC 780

ATCGTGAGCC GCCTGTTCTA CGGCAAGAGC TACTACCACG TGTACAAGCT CACGCTGTGT 840

CTCAGCTGCC TCAACAACTG TCTGGACCCG TTTGTTTATT ACTTTGCGTC CCGGGAATTC 900

CAGCTGCGCC TGCGGGAATA TTTGGGCTGC CGCCGGGTGC CCAGAGACAC CCTGGACACG 960

CGCCGCGAGA GCCTCTTCTC CGCCAGGACC ACGTCCGTGC GCTCCGAGGC CGGTGCGCAC 1020

CCTGAAGGGA TGGAGGGAGC CACCAGGCCC GGCCTCCAGA GGCAGGAGAG TGTGTTCTGA 1080

(31) INFORMATION FOR SEQ ID NO:30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 359 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

Met Gln Val Pro Asn Ser Thr Gly Pro Asp Asn Ala Thr Leu Gln Met
1 5 10 15

Leu Arg Asn Pro Ala Ile Ala Val Ala Leu Pro Val Val Tyr Ser Leu
20 25 30

Val Ala Ala Val Ser Ile Pro Gly Asn Leu Phe Ser Leu Trp Val Leu
35 40 45

Cys Arg Arg Met Gly Pro Arg Ser Pro Ser Val Ile Phe Met Ile Asn
50 55 60

Leu Ser Val Thr Asp Leu Met Leu Ala Ser Val Leu Pro Phe Gln Ile
65 70 75 80

Tyr Tyr His Cys Asn Arg His His Trp Val Phe Gly Val Leu Leu Cys
85 90 95

Asn Val Val Thr Val Ala Phe Tyr Ala Asn Met Tyr Ser Ser Ile Leu
100 105 110

Thr Met Thr Cys Ile Ser Val Glu Arg Phe Leu Gly Val Leu Tyr Pro
115 120 125

Leu Ser Ser Lys Arg Trp Arg Arg Arg Arg Tyr Ala Val Ala Ala Cys
130 135 140

Ala Gly Thr Trp Leu Leu Leu Leu Thr Ala Leu Cys Pro Leu Ala Arg
145 150 155 160

Thr Asp Leu Thr Tyr Pro Val His Ala Leu Gly Ile Ile Thr Cys Phe
165 170 175

Asp Val Leu Lys Trp Thr Met Leu Pro Ser Val Ala Met Trp Ala Val
180 185 190

Phe Leu Phe Thr Ile Phe Ile Leu Leu Phe Leu Ile Pro Phe Val Ile
195 200 205

Thr Val Ala Cys Tyr Thr Ala Thr Ile Leu Lys Leu Leu Arg Thr Glu
210 215 220

Glu Ala His Gly Arg Glu Gln Arg Arg Arg Ala Val Gly Leu Ala Ala
225 230 235 240

Val Val Leu Leu Ala Phe Val Thr Cys Phe Ala Pro Asn Asn Phe Val
245 250 255

Leu Leu Ala His Ile Val Ser Arg Leu Phe Tyr Gly Lys Ser Tyr Tyr
260 265 270

His Val Tyr Lys Leu Thr Leu Cys Leu Ser Cys Leu Asn Asn Cys Leu
275 280 285

Asp Pro Phe Val Tyr Tyr Phe Ala Ser Arg Glu Phe Gln Leu Arg Leu
290 295 300

Arg Glu Tyr Leu Gly Cys Arg Arg Val Pro Arg Asp Thr Leu Asp Thr
305 310 315 320

Arg Arg Glu Ser Leu Phe Ser Ala Arg Thr Thr Ser Val Arg Ser Glu
325 330 335

Ala Gly Ala His Pro Glu Gly Met Glu Gly Ala Thr Arg Pro Gly Leu
340 345 350

Gln Arg Gln Glu Ser Val Phe
355

(32) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1503 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:





ATGGAGCGTC CCTGGGAGGA CAGCCCAGGC CCGGAGGGGG CAGCTGAGGG CTCGCCTGTG 60

CCAGTCGCCG CCGGGGCGCG CTCCGGTGCC GCGGCGAGTG GCACAGGCTG GCAGCCATGG 120

GCTGAGTGCC CGGGACCCAA GGGGAGGGGG CAACTGCTGG CGACCGCCGG CCCTTTGCGT 180

CGCTGGCCCG CCCCCTCGCC TGCCAGCTCC AGCCCCGCCC CCGGAGCGGC GTCCGCTCAC 240

TCGGTTCAAG GCAGCGCGAC TGCGGGTGGC GCACGACCAG GGCGCAGACC TTGGGGCGCG 300

CGGCCCATGG AGTCGGGGCT GCTGCGGCCG GCGCCGGTGA GCGAGGTCAT CGTCCTGCAT 360

TACAACTACA CCGGCAAGCT CCGCGGTGCG AGCTACCAGC CGGGTGCCGG CCTGCGCGCC 420

GACGCCGTGG TGTGCCTGGC GGTGTGCGCC TTCATCGTGC TAGAGAATCT AGCCGTGTTG 480

TTGGTGCTCG GACGCCACCC GCGCTTCCAC GCTCCCATGT TCCTGCTCCT GGGCAGCCTC 540

ACGTTGTCGG ATCTGCTGGC AGGCGCCGCC TACGCCGCCA ACATCCTACT GTCGGGGCCG 600

CTCACGCTGA AACTGTCCCC CGCGCTCTGG TTCGCACGGG AGGGAGGCGT CTTCGTGGCA 660

CTCACTGCGT CCGTGCTGAG CCTCCTGGCC ATCGCGCTGG AGCGCAGCCT CACCATGGCG 720

CGCAGGGGGC CCGCGCCCGT CTCCAGTCGG GGGCGCACGC TGGCGATGGC AGCCGCGGCC 780

TGGGGCGTGT CGCTGCTCCT CGGGCTCCTG CCAGCGCTGG GCTGGAATTG CCTGGGTCGC 840

CTGGACGCTT GCTCCACTGT CTTGCCGCTC TACGCCAAGG CCTACGTGCT CTTCTGCGTG 900

CTCGCCTTCG TGGGCATCCT GGCCGCGATC TGTGCACTCT ACGCGCGCAT CTACTGCCAG 960

GTACGCGCCA ACGCGCGGCG CCTGCCGGCA CGGCCCGGGA CTGCGGGGAC CACCTCGACC 1020

CGGGCGCGTC GCAAGCCGCG CTCTCTGGCC TTGCTGCGCA CGCTCAGCGT GGTGCTCCTG 1080

GCCTTTGTGG CATGTTGGGG CCCCCTCTTC CTGCTGCTGT TGCTCGACGT GGCGTGCCCG 1140

GCGCGCACCT GTCCTGTACT CCTGCAGGCC GATCCCTTCC TGGGACTGGC CATGGCCAAC 1200

TCACTTCTGA ACCCCATCAT CTACACGCTC ACCAACCGCG ACCTGCGCCA CGCGCTCCTG 1260

CGCCTGGTCT GCTGCGGACG CCACTCCTGC GGCAGAGACC CGAGTGGCTC CCAGCAGTCG 1320

GCGAGCGCGG CTGAGGCTTC CGGGGGCCTG CGCCGCTGCC TGCCCCCGGG CCTTGATGGG 1380

AGCTTCAGCG GCTCGGAGCG CTCATCGCCC CAGCGCGACG GGCTGGAGAC CAGCGGCTCC 1440

ACAGGCAGCC CCGGTGCACC CACAGCCGCC CGGACTCTGG TATCAGAACC GGCTGCAGAC 1500

TCA 1503

(33) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 500 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:

Met Glu Arg Pro Trp Glu Asp Ser Pro Gly Pro Glu Gly Ala Ala Glu
1 5 10 15

Gly Ser Pro Val Pro Val Ala Ala Gly Ala Arg Ser Gly Ala Ala Ala
20 25 30

Ser Gly Thr Gly Trp Gln Pro Trp Ala Glu Cys Pro Gly Pro Lys Gly
35 40 45

Arg Gly Gln Leu Leu Ala Thr Ala Gly Pro Leu Arg Arg Trp Pro Ala
50 55 60

Pro Ser Pro Ala Ser Ser Ser Pro Ala Pro Gly Ala Ala Ser Ala His
65 70 75 80

Ser Val Gln Gly Ser Ala Thr Ala Gly Gly Ala Arg Pro Gly Arg Arg
85 90 95

Pro Trp Gly Ala Arg Pro Met Glu Ser Gly Leu Leu Arg Pro Ala Pro
100 105 110

Val Ser Glu Val Ile Val Leu His Tyr Asn Tyr Thr Gly Lys Leu Arg
115 120 125

Gly Ala Ser Tyr Gln Pro Gly Ala Gly Leu Arg Ala Asp Ala Val Val
130 135 140

Cys Leu Ala Val Cys Ala Phe Ile Val Leu Glu Asn Leu Ala Val Leu
145 150 155 160

Leu Val Leu Gly Arg His Pro Arg Phe His Ala Pro Met Phe Leu Leu
165 170 175

Leu Gly Ser Leu Thr Leu Ser Asp Leu Leu Ala Gly Ala Ala Tyr Ala
180 185 190

Ala Asn Ile Leu Leu Ser Gly Pro Leu Thr Leu Lys Leu Ser Pro Ala
195 200 205

Leu Trp Phe Ala Arg Glu Gly Gly Val Phe Val Ala Leu Thr Ala Ser
210 215 220

Val Leu Ser Leu Leu Ala Ile Ala Leu Glu Arg Ser Leu Thr Met Ala
225 230 235 240

Arg Arg Gly Pro Ala Pro Val Ser Ser Arg Gly Arg Thr Leu Ala Met
245 250 255

Ala Ala Ala Ala Trp Gly Val Ser Leu Leu Leu Gly Leu Leu Pro Ala
260 265 270

Leu Gly Trp Asn Cys Leu Gly Arg Leu Asp Ala Cys Ser Thr Val Leu
275 280 285

Pro Leu Tyr Ala Lys Ala Tyr Val Leu Phe Cys Val Leu Ala Phe Val
290 295 300

Gly Ile Leu Ala Ala Ile Cys Ala Leu Tyr Ala Arg Ile Tyr Cys Gln
305 310 315 320

Val Arg Ala Asn Ala Arg Arg Leu Pro Ala Arg Pro Gly Thr Ala Gly
325 330 335

Thr Thr Ser Thr Arg Ala Arg Arg Lys Pro Arg Ser Leu Ala Leu Leu
340 345 350

Arg Thr Leu Ser Val Val Leu Leu Ala Phe Val Ala Cys Trp Gly Pro
355 360 365

Leu Phe Leu Leu Leu Leu Leu Asp Val Ala Cys Pro Ala Arg Thr Cys
370 375 380

Pro Val Leu Leu Gln Ala Asp Pro Phe Leu Gly Leu Ala Met Ala Asn
385 390 395 400

Ser Leu Leu Asn Pro Ile Ile Tyr Thr Leu Thr Asn Arg Asp Leu Arg
405 410 415

His Ala Leu Leu Arg Leu Val Cys Cys Gly Arg His Ser Cys Gly Arg
420 425 430

Asp Pro Ser Gly Ser Gln Gln Ser Ala Ser Ala Ala Glu Ala Ser Gly
435 440 445

Gly Leu Arg Arg Cys Leu Pro Pro Gly Leu Asp Gly Ser Phe Ser Gly
450 455 460

Ser Glu Arg Ser Ser Pro Gln Arg Asp Gly Leu Asp Thr Ser Gly Ser
465 470 475 480

Thr Gly Ser Pro Gly Ala Pro Thr Ala Ala Arg Thr Leu Val Ser Glu
485 490 495

Pro Ala Ala Asp
500

(34) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1029 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

ATGCAAGCCG TCGACAATCT CACCTCTGCG CCTGGGAACA CCAGTCTGTG CACCAGAGAC 60 

TACAAAATCA CCCAGGTCCT CTTCCCACTG CTCTACACTG TCCTGTTTTT TGTTGGACTT 120 

ATCACAAATG GCCTGGCGAT GAGGATTTTC TTTCAAATCC GGAGTAAATC AAACTTTATT 180

ATTTTTCTTA AGAACACAGT CATTTCTGAT CTTCTCATGA TTCTGACTTT TCCATTCAAA 240

ATTCTTAGTG ATGCCAAACT GGGAACAGGA CCACTGAGAA CTTTTGTGTG TCAAGTTACC 300

TCCGTCATAT TTTATTTCAC AATGTATATC AGTATTTCAT TCCTGGGACT GATAACTATC 360

GATCGCTACC AGAAGACCAC CAGGCCATTT AAAACATCCA ACCCCAAAAA TCTCTTGGGG 420

GCTAAGATTC TCTCTGTTGT CATCTGGGCA TTCATGTTCT TACTCTCTTT GCCTAACATG 480

ATTCTGACCA ACAGGCAGCC GAGAGACAAG AATGTGAAGA AATGCTCTTT CCTTAAATCA 540

GAGTTCGGTC TAGTCTGGCA TGAAATAGTA AATTACATCT GTCAAGTCAT TTTCTGGATT 600

AATTTCTTAA TTGTTATTGT ATGTTATACA CTCATTACAA AAGAACTGTA CCGGTCATAC 660

GTAAGAACGA GGGGTGTAGG TAAAGTCCCC AGGAAAAAGG TGAACGTCAA AGTTTTCATT 720

ATCATTGCTG TATTCTTTAT TTGTTTTGTT CCTTTCCATT TTGCCCGAAT TCCTTACACC 780

CTGAGCCAAA CCCGGGATGT CTTTGACTGC ACTGCTGAAA ATACTCTGTT CTATGTGAAA 840

GAGAGCACTC TGTGGTTAAC TTCCTTAAAT GCATGCCTGG ATCCGTTCAT CTATTTTTTC 900

CTTTGCAAGT CCTTCAGAAA TTCCTTGATA AGTATGCTGA AGTGCCCCAA TTCTGCAACA 960

TCTCTGTCCC AGGACAATAG GAAAAAAGAA CAGGATGGTG GTGACCCAAA TGAAGAGACT 1020

CCAATGTAA 1029



(35) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 342 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:

Met Gln Ala Val Asp Asn Leu Thr Ser Ala Pro Gly Asn Thr Ser Leu
1 5 10 15

Cys Thr Arg Asp Tyr Lys Ile Thr Gln Val Leu Phe Pro Leu Leu Tyr
20 25 30

Thr Val Leu Phe Phe Val Gly Leu Ile Thr Asn Gly Leu Ala Met Arg
35 40 45

Ile Phe Phe Gln Ile Arg Ser Lys Ser Asn Phe Ile Ile Phe Leu Lys
50 55 60

Asn Thr Val Ile Ser Asp Leu Leu Met Ile Leu Thr Phe Pro Phe Lys
65 70 75 80

Ile Leu Ser Asp Ala Lys Leu Gly Thr Gly Pro Leu Arg Thr Phe Val
85 90 95

Cys Gln Val Thr Ser Val Ile Phe Tyr Phe Thr Met Tyr Ile Ser Ile
100 105 110

Ser Phe Leu Gly Leu Ile Thr Ile Asp Arg Tyr Gln Lys Thr Thr Arg
115 120 125

Pro Phe Lys Thr Ser Asn Pro Lys Asn Leu Leu Gly Ala Lys Ile Leu
130 135 140

Ser Val Val Ile Trp Ala Phe Met Phe Leu Leu Ser Leu Pro Asn Met
145 150 155 160

Ile Leu Thr Asn Arg Gln Pro Arg Asp Lys Asn Val Lys Lys Cys Ser
165 170 175

Phe Leu Lys Ser Glu Phe Gly Leu Val Trp His Glu Ile Val Asn Tyr
180 185 190

Ile Cys Gln Val Ile Phe Trp Ile Asn Phe Leu Ile Val Ile Val Cys
195 200 205

Tyr Thr Leu Ile Thr Lys Glu Leu Tyr Arg Ser Tyr Val Arg Thr Arg
210 215 220

Gly Val Gly Lys Val Pro Arg Lys Lys Val Asn Val Lys Val Phe Ile
225 230 235 240

Ile Ile Ala Val Phe Phe Ile Cys Phe Val Pro Phe His Phe Ala Arg
245 250 255

Ile Pro Tyr Thr Leu Ser Gln Thr Arg Asp Val Phe Asp Cys Thr Ala
260 265 270

Glu Asn Thr Leu Phe Tyr Val Lys Glu Ser Thr Leu Trp Leu Thr Ser
275 280 285

Leu Asn Ala Cys Leu Asp Pro Phe Ile Tyr Phe Phe Leu Cys Lys Ser
290 295 300

Phe Arg Asn Ser Leu Ile Ser Met Leu Lys Cys Pro Asn Ser Ala Thr
305 310 315 320

Ser Leu Ser Gln Asp Asn Arg Lys Lys Glu Gln Asp Gly Gly Asp Pro
325 330 335

Asn Glu Glu Thr Pro Met
340

(36) INFORMATION FOR SEQ ID NO:35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1077 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:

ATGTCGGTCT GCTACCGTCC CCCAGGGAAC GAGACACTGC TGAGCTGGAA GACTTCGCGG 60

GCCACAGGCA CAGCCTTCCT GCTGCTGGCG GCGCTGCTGG GGCTGCCTGG CAACGGCTTC 120 

GTGGTGTGGA GCTTGGCGGG CTGGCGGCCT GCACGGGGGC GACCGCTGGC GGCCACGCTT 180 

GTGCTGCACC TGGCGCTGGC CGACGGCGCG GTGCTGCTGC TCACGCCGCT CTTTGTGGCC 240 

TTCCTGACCC GGCAGGCCTG GCCGCTGGGC CAGGCGGGCT GCAAGGCGGT GTACTACGTG 300 




TGCGCGCTCA GCATGTACGC CAGCGTGCTG CTCACCGGCC TGCTCAGCCT GCAGCGCTGC 360

CTCGCAGTCA CCCGCCCCTT CCTGGCGCCT CGGCTGCGCA GCCCGGCCCT GGCCCGCCGC 420

CTGCTGCTGG CGGTCTGGCT GGCCGCCCTG TTGCTCGCCG TCCCGGCCGC CGTCTACCGC 480

[illegible] [illegible]GACCGCGT ATGCCAGCTG TGCCACCCGT CGCCGGTCCA CGCCGCCGCC 540

CACCTGAGCC TGGAGACTCT GACCGCTTTC GTGCTTCCTT TCGGGCTGAT GCTCGGCTGC 600

TACAGCGTGA CGCTGGCACG GCTGCGGGGC GCCCGCTGGG GCTCCGGGCG GCACGGGGCG 660

CGGGTGGGCC GGCTGGTGAG CGCCATCGTG CTTGCCTTCG GCTTGCTCTG GGCCCCCTAC 720

CACGCAGTCA ACCTTCTGCA GGCGGTCGCA GCGCTGGCTC CACCGGAAGG GGCCTTGGCG 780

AAGCTGGGCG GAGCCGGCCA GGCGGCGCGA GCGGGAACTA CGGCCTTGGC CTTCTTCAGT 840

TCTAGCGTCA ACCCGGTGCT CTACGTCTTC ACCGCTGGAG ATCTGCTGCC CCGGGCAGGT 900

CCCCGTTTCC TCACGCGGCT CTTCGAAGGC TCTGGGGAGG CCCGAGGGGG CGGCCGCTCT 960

AGGGAAGGGA CCATGGAGCT CCGAACTACC CCTCAGCTGA AAGTGGTGGG GCAGGGCCGC 1020

GGCAATGGAG ACCCGGGGGG TGGGATGGAG AAGGACGGTC CGGAATGGGA CCTTTGA 1077

(37) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 358 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:

Met Ser Val Cys Tyr Arg Pro Pro Gly Asn Glu Thr Leu Leu Ser Trp
1 5 10 15

Lys Thr Ser Arg Ala Thr Gly Thr Ala Phe Leu Leu Leu Ala Ala Leu
20 25 30

Leu Gly Leu Pro Gly Asn Gly Phe Val Val Trp Ser Leu Ala Gly Trp
35 40 45

Arg Pro Ala Arg Gly Arg Pro Leu Ala Ala Thr Leu Val Leu His Leu
50 55 60

Ala Leu Ala Asp Gly Ala Val Leu Leu Leu Thr Pro Leu Phe Val Ala
65 70 75 80

Phe Leu Thr Arg Gln Ala Trp Pro Leu Gly Gln Ala Gly Cys Lys Ala
85 90 95

Val Tyr Tyr Val Cys Ala Leu Ser Met Tyr Ala Ser Val Leu Leu Thr
100 105 110

Gly Leu Leu Ser Leu Gln Arg Cys Leu Ala Val Thr Arg Pro Phe Leu
115 120 125

Ala Pro Arg Leu Arg Ser Pro Ala Leu Ala Arg Arg Leu Leu Leu Ala
130 135 140

Val Trp Leu Ala Ala Leu Leu Leu Ala Val Pro Ala Ala Val Tyr Arg
145 150 155 160

His Leu Trp Arg Asp Arg Val Cys Gln Leu Cys His Pro Ser Pro Val
165 170 175

His Ala Ala Ala His Leu Ser Leu Glu Thr Leu Thr Ala Phe Val Leu
180 185 190

Pro Phe Gly Leu Met Leu Gly Cys Tyr Ser Val Thr Leu Ala Arg Leu
195 200 205

Arg Gly Ala Arg Trp Gly Ser Gly Arg His Gly Ala Arg Val Gly Arg
210 215 220

Leu Val Ser Ala Ile Val Leu Ala Phe Gly Leu Leu Trp Ala Pro Tyr
225 230 235 240

His Ala Val Asn Leu Leu Gln Ala Val Ala Ala Leu Ala Pro Pro Glu
245 250 255

Gly Ala Leu Ala Lys Leu Gly Gly Ala Gly Gln Ala Ala Arg Ala Gly
260 265 270

Thr Thr Ala Leu Ala Phe Phe Ser Ser Ser Val Asn Pro Val Leu Tyr
275 280 285

Val Phe Thr Ala Gly Asp Leu Leu Pro Arg Ala Gly Pro Arg Phe Leu
290 295 300

Thr Arg Leu Phe Glu Gly Ser Gly Glu Ala Arg Gly Gly Gly Arg Ser
305 310 315 320

Arg Glu Gly Thr Met Glu Leu Arg Thr Thr Pro Gln Leu Lys Val Val
325 330 335

Gly Gln Gly Arg Gly Asn Gly Asp Pro Gly Gly Gly Met Glu Lys Asp
340 345 350

Gly Pro Glu Trp Asp Leu *
355



(38) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1005 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

ATGCTGGGGA TCATGGCATG GAATGCAACT TGCAAAAACT GGCTGGCAGC AGAGGCTGCC 60

CTGGAAAAGT ACTACCTTTC CATTTTTTAT GGGATTGAGT TCGTTGTGGG AGTCCTTGGA 120

AATACCATTG TTGTTTACGG CTACATCTTC TCTCTGAAGA ACTGGAACAG CAGTAATATT 180

TATCTCTTTA ACCTCTCTGT CTCTGACTTA GCTTTTCTGT GCACCCTCCC CATGCTGATA 240

AGGAGTTATG CCAATGGAAA CTGGATATAT GGAGACGTGC TCTGCATAAG CAACCGATAT 300

GTGCTTCATG CCAACCTCTA TACCAGCATT CTCTTTCTCA CTTTTATCAG CATAGATCGA 360

TACTTGATAA TTAAGTATCC TTTCCGAGAA CACCTTCTGC AAAAGAAAGA GTTTGCTATT 420

TTAATCTCCT TGGCCATTTG GGTTTTAGTA ACCTTAGAGT TACTACCCAT ACTTCCCCTT 480

ATAAATCCTG TTATAACTGA [illegible] ACCTGTAATG ATTTTGCAAG TTCTGGAGAC 540

CCCAACTACA ACCTCATTTA CAGCATGTGT CTAACACTGT TGGGGTTCCT TATTCCTCTT 600

TTTGTGATGT GTTTCTTTTA TTACAAGATT GCTCTCTTCC TAAAGCAGAG GAATAGGCAG 660

GTTGCTACTG CTCTGCCCCT TGAAAAGCCT CTCAACTTGG TCATCATGGC AGTGGTAATC 720

TTCTCTGTGC TTTTTACACC CTATCACGTC ATGCGGAATG TGAGGATCGC TTCACGCCTG 780

[illegible] GTGCACTCAG GTCGTCATCA ACTCCTTTTA CATTGTGACA 840

CGGCCTTTGG CCTTTCTGAA CAGTGTCATC AACCCTGTCT TCTATTTTCT TTTGGGAGAT 900

[illegible] ACATGCTGAT GAATCAACTG AGACACAACT TCAAATCCCT TACATCCTTT 960

AGCAGATGGG CTCATGAACT CCTACTTTCA TTCAGAGAAA AGTGA 1005

(39) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 334 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:

Met Leu Gly Ile Met Ala Trp Asn Ala Thr Cys Lys Asn Trp Leu Ala
1 5 10 15

Ala Glu Ala Ala Leu Glu Lys Tyr Tyr Leu Ser Ile Phe Tyr Gly Ile
20 25 30

Glu Phe Val Val Gly Val Leu Gly Asn Thr Ile Val Val Tyr Gly Tyr
35 40 45

Ile Phe Ser Leu Lys Asn Trp Asn Ser Ser Asn Ile Tyr Leu Phe Asn
50 55 60

Leu Ser Val Ser Asp Leu Ala Phe Leu Cys Thr Leu Pro Met Leu Ile
65 70 75 80

Arg Ser Tyr Ala Asn Gly Asn Trp Ile Tyr Gly Asp Val Leu Cys Ile
85 90 95

Ser Asn Arg Tyr Val Leu His Ala Asn Leu Tyr Thr Ser Ile Leu Phe
100 105 110

Leu Thr Phe Ile Ser Ile Asp Arg Tyr Leu Ile Ile Lys Tyr Pro Phe
115 120 125

Arg Glu His Leu Leu Gln Lys Lys Glu Phe Ala Ile Leu Ile Ser Leu
130 135 140

Ala Ile Trp Val Leu Val Thr Leu Glu Leu Leu Pro Ile Leu Pro Leu
145 150 155 160

Ile Asn Pro Val Ile Thr Asp Asn Gly Thr Thr Cys Asn Asp Phe Ala
165 170 175

Ser Ser Gly Asp Pro Asn Tyr Asn Leu Ile Tyr Ser Met Cys Leu Thr
180 185 190

Leu Leu Gly Phe Leu Ile Pro Leu Phe Val Met Cys Phe Phe Tyr Tyr
195 200 205

Lys Ile Ala Leu Phe Leu Lys Gln Arg Asn Arg Gln Val Ala Thr Ala
210 215 220

Leu Pro Leu Glu Lys Pro Leu Asn Leu Val Ile Met Ala Val Val Ile
225 230 235 240

Phe Ser Val Leu Phe Thr Pro Tyr His Val Met Arg Asn Val Arg Ile
245 250 255

Ala Ser Arg Leu Gly Ser Trp Lys Gln Tyr Gln Cys Thr Gln Val Val
260 265 270

Ile Asn Ser Phe Tyr Ile Val Thr Arg Pro Leu Ala Phe Leu Asn Ser
275 280 285

Val Ile Asn Pro Val Phe Tyr Phe Leu Leu Gly Asp His Phe Arg Asp
290 295 300

Met Leu Met Asn Gln Leu Arg His Asn Phe Lys Ser Leu Thr Ser Phe
305 310 315 320

Ser Arg Trp Ala His Glu Leu Leu Leu Ser Phe Arg Glu Lys
325 330

(40) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1296 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

ATGCAGGCGC TTAACATTAC CCCGGAGCAG TTCTCTCGGC TGCTGCGGGA CCACAACCTG 60

ACGCGGGAGC AGTTCATCGC TCTGTACCGG CTGCGACCGC TCGTCTACAC CCCAGAGCTG 120

CCGGGACGCG CCAAGCTGGC CCTCGTGCTC ACCGGCGTGC TCATCTTCGC CCTGGCGCTC 180

[illegible]AATG CTCTGGTGTT CTACGTGGTG ACCCGCAGCA AGGCCATGCG CACCGTCACC 240

AACATCTTTA TCTGCTCCTT GGCGCTCAGT GACCTGCTCA TCACCTTCTT CTGCATTCCC 300

GTCACCATGC TCCAGAACAT TTCCGACAAC TGGCTGGGGG GTGCTTTCAT TTGCAAGATG 360

GTGCCATTTG TCCAGTCTAC CGCTGTTGTG ACAGAAATGC TCACTATGAC CTGCATTGCT 420

GTGGAAAGGC ACCAGGGACT TGTGCATCCT TTTAAAATGA AGTGGCAATA CACCAACCGA 480

AGGGCTTTCA CAATGCTAGG TGTGGTCTGG CTGGTGGCAG TCATCGTAGG ATCACCCATG 540

TGGCACGTGC AACAACTTGA GATCAAATAT GACTTCCTAT ATGAAAAGGA ACACATCTGC 600

TGCTTAGAAG AGTGGACCAG CCCTGTGCAC CAGAAGATCT ACACCACCTT CATCCTTGTC 660

ATCCTCTTCC TCCTGCCTCT TATGGTGATG CTTATTCTGT ACAGTAAAAT TGGTTATGAA 720

CTTTGGATAA AGAAAAGAGT TGGGGATGGT TCAGTGCTTC GAACTATTCA TGGAAAAGAA 780

ATGTCCAAAA TAGCCAGGAA GAAGAAACGA GCTGTCATTA TGATGGTGAC AGTGGTGGCT 840

CTCTTTGCTG TGTGCTGGGC ACCATTCCAT GTTGTCCATA TGATGATTGA ATACAGTAAT 900

TTTGAAAAGG AATATGATGA TGTCACAATC AAGATGATTT TTGCTATCGT GCAAATTATT 960

GGATTTTCCA ACTCCATCTG TAATCCCATT GTCTATGCAT TTATGAATGA AAACTTCAAA 1020

AAAAATGTTT TGTCTGCAGT TTGTTATTGC ATAGTAAATA AAACCTTCTG TCCAGCACAA 1080

AGGCATGGAA ATTCAGGAAT TACAATGATG CGGAAGAAAG CAAAGTTTTC CCTCAGAGAG 1140

AATCCAGTGG AGGAAACCAA AGGAGAAGCA TTCAGTGATG GCAACATTGA AGTCAAATTG 1200

TGTGAACAGA CAGAGGAGAA GAAAAAGCTC AAACGACATC TTGCTCTCTT TAGGTCTGAA 1260

CTGGCTGAGA ATTCTCCTTT AGACAGTGGG CATTAA 1296



(41) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 431 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Met Gln Ala Leu Asn Ile Thr Pro Glu Gln Phe Ser Arg Leu Leu Arg
1 5 10 15

Asp His Asn Leu Thr Arg Glu Gln Phe Ile Ala Leu Tyr Arg Leu Arg
20 25 30

Pro Leu Val Tyr Thr Pro Glu Leu Pro Gly Arg Ala Lys Leu Ala Leu
35 40 45

Val Leu Thr Gly Val Leu Ile Phe Ala Leu Ala Leu Phe Gly Asn Ala
50 55 60

Leu Val Phe Tyr Val Val Thr Arg Ser Lys Ala Met Arg Thr Val Thr
65 70 75 80

Asn Ile Phe Ile Cys Ser Leu Ala Leu Ser Asp Leu Leu Ile Thr Phe
85 90 95

Phe Cys Ile Pro Val Thr Met Leu Gln Asn Ile Ser Asp Asn Trp Leu
100 105 110

Gly Gly Ala Phe Ile Cys Lys Met Val Pro Phe Val Gln Ser Thr Ala
115 120 125

Val Val Thr Glu Met Leu Thr Met Thr Cys Ile Ala Val Glu Arg His
130 135 140

Gln Gly Leu Val His Pro Phe Lys Met Lys Trp Gln Tyr Thr Asn Arg
145 150 155 160

Arg Ala Phe Thr Met Leu Gly Val Val Trp Leu Val Ala Val Ile Val
165 170 175

Gly Ser Pro Met Trp His Val Gln Gln Leu Glu Ile Lys Tyr Asp Phe
180 185 190

Leu Tyr Glu Lys Glu His Ile Cys Cys Leu Glu Glu Trp Thr Ser Pro
195 200 205

Val His Gln Lys Ile Tyr Thr Thr Phe Ile Leu Val Ile Leu Phe Leu
210 215 220

Leu Pro Leu Met Val Met Leu Ile Leu Tyr Ser Lys Ile Gly Tyr Glu
225 230 235 240

Leu Trp Ile Lys Lys Arg Val Gly Asp Gly Ser Val Leu Arg Thr Ile
245 250 255

His Gly Lys Glu Met Ser Lys Ile Ala Arg Lys Lys Lys Arg Ala Val
260 265 270

Ile Met Met Val Thr Val Val Ala Leu Phe Ala Val Cys Trp Ala Pro
275 280 285

Phe His Val Val His Met Met Ile Glu Tyr Ser Asn Phe Glu Lys Glu
290 295 300

Tyr Asp Asp Val Thr Ile Lys Met Ile Phe Ala Ile Val Gln Ile Ile
305 310 315 320

Gly Phe Ser Asn Ser Ile Cys Asn Pro Ile Val Tyr Ala Phe Met Asn
325 330 335

Glu Asn Phe Lys Lys Asn Val Leu Ser Ala Val Cys Tyr Cys Ile Val
340 345 350

Asn Lys Thr Phe Ser Pro Ala Gln Arg His Gly Asn Ser Gly Ile Thr
355 360 365

Met Met Arg Lys Lys Ala Lys Phe Ser Leu Arg Glu Asn Pro Val Glu
370 375 380

Glu Thr Lys Gly Glu Ala Phe Ser Asp Gly Asn Ile Glu Val Lys Leu
385 390 395 400

Cys Glu Gln Thr Glu Glu Lys Lys Lys Leu Lys Arg His Leu Ala Leu
405 410 415

Phe Arg Ser Glu Leu Ala Glu Asn Ser Pro Leu Asp Ser Gly His
420 425 430

(42) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

CTGTGTACAG CAGTTCGCAG AGTG 24

(43) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

GAGTGCCAGG CAGAGCAGGT AGAC 24

(44) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

CCCGAATTCC TGCTTGCTCC CAGCTTGGCC C 31

(45) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

TGTGGATCCT GCTGTCAAAG GTCCCATTCC GG 32

(46) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

TCACAATGCT AGGTGTGGTC 20

(47) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

TGCATAGACA ATGGGATTAC AG 22

(48) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 511 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

TCACAATGCT AGGTGTGGTC TGGCTGGTGG CAGTCATCGT AGGATCACCC ATGTGGCACG 60

[illegible] TGAGATCAAA TATGACT[illegible] TATATGAAA[illegible] GGAACACATC TGCTGCTTAG 120

AAGAGTGGAC CAGCCCTGTG CACCAGAAGA TCTACACCAC CTTCATCCTT GTCATCCTCT 180

TCCTCCTGCC TCTTATGGTG ATGCTTATTC TGTACGTAAA ATTGGTTATG AACTTTGGAT 240

AAAGAAAAGA GTTGGGGATG GTTCAGTGCT TCGAACTATT CATGGAAAAG AAATGTCCAA 300

AATAGCCAGG AAGAAGAAAC GAGCTGTCAT TATGATGGTG ACAGTGGTGG CTCTCTTTGC 360

TGTGTGCTGG GCACCATTCC ATGTTGTCCA TATGATGATT GAATACAGTA ATTTTGAAAA 420

GGAATATGAT GATGTCACAA TCAAGATGAT TTTTGCTATC GTGCAAATTA TTGGATTTTC 480

CAACTCCATC TGTAATCCCA TTGTCTATGC A 511
(49) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 
CTGCTTAGAA GAGTGGACCA G 21

(50) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

CTGTGCACCA GAAGATCTAC AC 22 

(51) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

CAAGGATGAA GGTGGTGTAG A 21

(52) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:

GTGTAGATCT TCTGGTGCAC AGG 23

(53) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:

GCAATGCAGG TCATAGTGAG C 21

(54) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: YES
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

TGGAGCATGG TGACGGGAAT GCAGAAG 27

(55) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

GTGATGAGCA GGTCACTGAG CGCCAAG 27

(56) INFORMATION FOR SEQ ID NO:55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:

GCAATGCAGG CGCTTAACAT TAG 23

(57) INFORMATION FOR SEQ ID NO:56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:

TTGGGTTACA ATCTGAAGGG CA 22

(58) INFORMATION FOR SEQ ID NO:57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

ACTCCGTGTC CAGCAGGACT CTG 23

(59) INFORMATION FOR SEQ ID NO:58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

TGCGTGTTCC TGGACCCTCA CGTG 24

(60) INFORMATION FOR SEQ ID NO:59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

CAGGCCTTGG ATTTTAATGT XAGGGATGG 29

(61) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

GGAGAGTCAG CTCTGAAAGA ATTCAGG 27

(62) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:

TGATGTGATG CCAGATACTA ATAGCAC 27

(63) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

CCTGATTCAT TTAGGTGAGA TTGAGAC 27

(64) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

CCCAAGCTTC CCCAGGTGTA TTTGAT 26

(65) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

GTTGGATCCA CATAATGCAT TTTCTC 26

(66) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1080 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

ATGATTCTCA ACTCTTCTAC TGAAGATGGT ATTAAAAGAA TCCAAGATGA TTGTCCCAAA 60

GCTGGAAGGC ATAATTACAT ATTTGTCATG ATTCCTACTT TATACAGTAT CATCTTTGTG 120

GTGGGAATAT TTGGAAACAG CTTGGTGGTG ATAGTCATTT ACTTTTATAT GAAGCTGAAG 180

ACTGTGGCCA GTGTTTTTCT TTTGAATTTA GCACTGGCTG ACTTATGCTT TTTACTGACT 240

TTGCCACTAT GGGCTGTCTA CACAGCTATG GAATACCGCT GGCCCTTTGG CAATTACCTA 300

TGTAAGATTG CTTCAGCCAG CGTCAGTTTC AACCTGTACG CTAGTGTGTT TCTACTCACG 360

TGTCTCAGCA TTGATCGATA CCTGGCTATT GTTCACCCAA TGAAGTCCCG CCTTCGACGC 420

ACAATGCTTG TAGCCAAAGT CACCTGCATC ATCATTTGGC TGCTGGCAGG CTTGGCCAGT 480

TTGCCAGCTA TAATCCATCG AAATGTATTT TTCATTGAGA ACACCAATAT TACAGTTTGT 540

GCTTTCCATT ATGAGTCCCA AAATTCAACC CTTCCGATAG GGCTGGGCCT GACCAAAAAT 600

ATACTGGGTT TCCTGTTTCC TTTTCTGATC ATTCTTACAA GTTATACTCT TATTTGGAAG 660

GCCCTAAAGA AGGCTTATGA AATTCAGAAG AACAAACCAA GAAATGATGA TATTTTTAAG 720

ATAATTATGG CAATTGTGCT TTTCTTTTTC TTTTCCTGGA TTCCCCACCA AATATTCACT 780

TTTCTGGATG TATTGATTCA ACTAGGCATC ATACGTGACT GTAGAATTGC AGATATTGTG 840

GACACGGCCA TGCCTATCAC CATTTGTATA GCTTATTTTA ACAATTGCCT GAATCCTCTT 900

TTTTATGGCT TTCTGGGGAA AAAATTTAAA AGATATTTTC TCCAGCTTCT AAAATATATT 960

CCCCCAAAAG CCAAATCCCA CTCAAACCTT TCAACAAAAA TGAGCACGCT TTCCTACCGC 1020

CCCTCAGATA ATGTAAGCTC ATCCACCAAG AAGCCTGCAC CATGTTTTGA GGTTGAGTGA 1080

(67) INFORMATION FOR SEQ ID NO:66: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 359 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:

Met Ile Leu Asn Ser Ser Thr Glu Asp Gly Ile Lys Arg Ile Gln Asp
1 5 10 15

Asp Cys Pro Lys Ala Gly Arg His Asn Tyr Ile Phe Val Met Ile Pro
20 25 30

Thr Leu Tyr Ser Ile Ile Phe Val Val Gly Ile Phe Gly Asn Ser Leu
35 40 45

Val Val Ile Val Ile Tyr Phe Tyr Met Lys Leu Lys Thr Val Ala Ser
50 55 60

Val Phe Leu Leu Asn Leu Ala Leu Ala Asp Leu Cys Phe Leu Leu Thr
65 70 75 80

Leu Pro Leu Trp Ala Val Tyr Thr Ala Met Glu Tyr Arg Trp Pro Phe
85 90 95

Gly Asn Tyr Leu Cys Lys Ile Ala Ser Ala Ser Val Ser Phe Asn Leu
100 105 110

Tyr Ala Ser Val Phe Leu Leu Thr Cys Leu Ser Ile Asp Arg Tyr Leu
115 120 125

Ala Ile Val His Pro Met Lys Ser Arg Leu Arg Arg Thr Met Leu Val
130 135 140

Ala Lys Val Thr Cys Ile Ile Ile Trp Leu Leu Ala Gly Leu Ala Ser
145 150 155 160

Leu Pro Ala Ile Ile His Arg Asn Val Phe Phe Ile Glu Asn Thr Asn
165 170 175

Ile Thr Val Cys Ala Phe His Tyr Glu Ser Gln Asn Ser Thr Leu Pro
180 185 190

Ile Gly Leu Gly Leu Thr Lys Asn Ile Leu Gly Phe Leu Phe Pro Phe
195 200 205

Leu Ile Ile Leu Thr Ser Tyr Thr Leu Ile Trp Lys Ala Leu Lys Lys
210 215 220

Ala Tyr Glu Ile Gln Lys Asn Lys Pro Arg Asn Asp Asp Ile Phe Lys
225 230 235 240

Ile Ile Met Ala Ile Val Leu Phe Phe Phe Phe Ser Trp Ile Pro His
245 250 255

Gln Ile Phe Thr Phe Leu Asp Val Leu Ile Gln Leu Gly Ile Ile Arg
260 265 270

Asp Cys Arg Ile Ala Asp Ile Val Asp Thr Ala Met Pro Ile Thr Ile
275 280 285

Cys Ile Ala Tyr Phe Asn Asn Cys Leu Asn Pro Leu Phe Tyr Gly Phe
290 295 300

Leu Gly Lys Lys Phe Lys Arg Tyr Phe Leu Gln Leu Leu Lys Tyr Ile
305 310 315 320

Pro Pro Lys Ala Lys Ser His Ser Asn Leu Ser Thr Lys Met Ser Thr
325 330 335

Leu Ser Tyr Arg Pro Ser Asp Asn Val Ser Ser Ser Thr Lys Lys Pro
340 345 350

Ala Pro Cys Phe Glu Val Glu
355

(68) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

ACCATGGGCA GCCCCTGGAA CGGCAGC 27

(69) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

AGAACCACCA CCAGCAGGAC GCGGACGGTC TGCCGGTGG 39

(70) INFORMATION FOR SEQ ID NO: 69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:

GTCCGCGTCC TGCTGGTGGT GGTTCTGGCA TTTATAATT 39

(71) INFORMATION FOR SEQ ID NO:70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:

CCTGGATCCT TATCCCATCG TCTTCACGTT AGC 33

(72) INFORMATION FOR SEQ ID NO:71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

CTGGAATTCT CCTGCCAGCA TGGTGA 26

(73) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

GCAGGATCCT ATATTGCGTG CTCTGTCCCC 30

(74) INFORMATION FOR SEQ ID NO:73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

ATGGTGAACT CCACCCACCG TGGGATGCAC ACTTCTCTGC ACCTCTGGAA CCGCAGCAGT 60

TACAGACTGC ACAGCAATGC CAGTGAGTCC CTTGGAAAAG GCTACTCTGA TGGAGGGTGC 120

TACGAGCAAC TTTTTGTCTC TCCTGAGGTG TTTGTGACTC TGGGTGTCAT CAGCTTGTTG 180

GAGAATATCT TAGTGATTGT GGCAATAGCC AAGAACAAGA ATCTGCATTC ACCCATGTAC 240

TTTTTCATCT GCAGCTTGGC TGTGGCTGAT ATGCTGGTGA GCGTTTCAAA TGGATCAGAA 300

ACCATTATCA TCACCCTATT AAACAGTACA GATACGGATG CACAGAGTTT CACAGTGAAT 360

ATTGATAATG TCATTGACTC GGTGATCTGT AGCTCCTTGC TTGCATCCAT TTGCAGCCTG 420

CTTTGAATTG CAGTGGACAG GTACTTTACT ATCTTCTATG CTCTCCAGTA CCATAACATT 480

ATGAGAGTTA AGCGGGTTGG GATCAGCATA AGTTGTATCT GGGCAGCTTG CACGGTTTCA 540

GGCATTTTGT TCATCATTTA CTCAGATAGT AGTGCTGTCA TCATCTGCCT CATCACCATG 600

TTCTTCACCA TGCTGGCTCT CATGGCTTCT CTCTATGTCC ACATGTTCCT GATGGCCAGG 660

CTTCACATTA AGAGGATTGC TGTCCTCCCC GGCACTGGTG CCATCCGCCA AGGTGCCAAT 720

ATGAAGGGAG CGATTACCTT GACCATCCTG ATTGGCGTGT TTGTTGTCTG CTGGGCCCCA 780

TTCTTCCTCC ACTTAATATT CTACATCTCT TGTCCTCAGA ATCCATATTG TGTGTGCTTC 840

ATGTCTCACT TTAACTTGTA TCTCATACTG ATCATGTGTA ATTCAATCAT GGATCCTCTG 900

ATTTATGCAC TCCGGAGTCA AGAACTGAGG AAAACCTTCA AAGAGATCAT CTGTTGCTAT 960

CCCCTGGGAG GCCTTTGTGA CTTGTCTAGC AGATATTAA 999

(75) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 332 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:

Met Val Asn Ser Thr His Arg Gly Met His Thr Ser Leu His Leu Trp
1 5 10 15

Asn Arg Ser Ser Tyr Arg Leu His Ser Asn Ala Ser Glu Ser Leu Gly
20 25 30

Lys Gly Tyr Ser Asp Gly Gly Cys Tyr Glu Gln Leu Phe Val Ser Pro
35 40 45

Glu Val Phe Val Thr Leu Gly Val Ile Ser Leu Leu Glu Asn Ile Leu
50 55 60

Val Ile Val Ala Ile Ala Lys Asn Lys Asn Leu His Ser Pro Met Tyr
65 70 75 80

Phe Phe Ile Cys Ser Leu Ala Val Ala Asp Met Leu Val Ser Val Ser
85 90 95

Asn Gly Ser Glu Thr Ile Ile Ile Thr Leu Leu Asn Ser Thr Asp Thr
100 105 110

Asp Ala Gln Ser Phe Thr Val Asn Ile Asp Asn Val Ile Asp Ser Val
115 120 125

Ile Cys Ser Ser Leu Leu Ala Ser Ile Cys Ser Leu Leu Ser Ile Ala
130 135 140

Val Asp Arg Tyr Phe Thr Ile Phe Tyr Ala Leu Gln Tyr His Asn Ile
145 150 155 160

Met Thr Val Lys Arg Val Gly Ile Ser Ile Ser Cys Ile Trp Ala Ala
165 170 175

Cys Thr Val Ser Gly Ile Leu Phe Ile Ile Tyr Ser Asp Ser Ser Ala
180 185 190

Val Ile Ile Cys Leu Ile Thr Met Phe Phe Thr Met Leu Ala Leu Met
195 200 205

Ala Ser Leu Tyr Val His Met Phe Leu Met Ala Arg Leu His Ile Lys
210 215 220

Arg Ile Ala Val Leu Pro Gly Thr Gly Ala Ile Arg Gln Gly Ala Asn
225 230 235 240

Met Lys Gly Ala Ile Thr Leu Thr Ile Leu Ile Gly Val Phe Val Val
245 250 255

Cys Trp Ala Pro Phe Phe Leu His Leu Ile Phe Tyr Ile Ser Cys Pro
260 265 270

Gln Asn Pro Tyr Cys Val Cys Phe Met Ser His Phe Asn Leu Tyr Leu
275 280 285

Ile Leu Ile Met Cys Asn Ser Ile Ile Asp Pro Leu Ile Tyr Ala Leu
290 295 300

Arg Ser Gln Glu Leu Arg Lys Thr Phe Lys Glu Ile Ile Cys Cys Tyr
305 310 315 320

Pro Leu Gly Gly Leu Cys Asp Leu Ser Ser Arg Tyr
325 330

(76) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:

CCGAAGCTTC GAGCTGAGTA AGGCGGCGGG CT 32

(77) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

GTGGAATTCA TTTGCCCTGC CTCAACCCCC A 31

(78) INFORMATION FOR SEQ ID NO: 77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1344 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:

ATGGAGCTGC TAAAGCTGAA CCGGAGCGTG CAGGGAACCG GACCCGGGCC GGGGGCTTCC 60 

CTGTGCCGCC CGGGGGCGCC TCTCCTCAAC AGCAGCAGTG TGGGCAACCT CAGCTGCGAG 120

CCCCCTCGCA TTCGCGGAGC CGGGACACGA GAATTGGAGC TGGCCATTAG AATCACTCTT 180

TACGCAGTGA TCTTCCTGAT GAGCGTTGGA GGAAATATGC TCATCATCGT GGTCCTGGGA 240

CTGAGCCGCC GCCTGAGGAC TGTCACCAAT GCCTTCCTCC TCTCACTGGC AGTCAGCGAC 300

CTCCTGCTGG CTGTGGCTTG CATGCCCTTC ACCCTCCTGC CCAATCTCAT GGGCACATTC 360

ATCTTTGGCA CCGTCATCTG CAAGGCGGTT TCCTACCTCA TGGGGGTGTC TGTGAGTGTG 420

TCCACGCTAA GCCTCGTGGC CATCGCACTG GAGCGATATA GCGCCATCTG CCGACCACTG 480

CAGGCACGAG TGTGGCAGAC GCGCTCCCAC GCGGCTCGCG TGATTGTAGC CACGTGGCTG 540

CTGTCCGGAC TACTCATGGT GCCCTACCCC GTGTACACTG TCGTGCAACC AGTGGGGCCT 600

CGTGTGCTGC AGTGCGTGCA TCGCTGGCCC AGTGCGCGGG TCCGCCAGAC CTGGTCCGTA 660

CTGCTGCTTC TGCTCTTGTT CTTCATCCCA GGTGTGGTTA TGGCCGTGGC CTACGGGCTT 720

ATCTCTCGCG AGCTCTACTT AGGGCTTCGC TTTGACGGCG ACAGTGACAG CGACAGCCAA 780

AGCAGGGTCC GAAACCAAGG CGGGCTGCCA GGGGCTGTTC ACCAGAACGG GCGTTGCCGG 840

CCTGAGACTG GCGCGGTTGG CAAAGACAGC GATGGCTGCT ACGTGCAACT TCCACGTTCC 900

CGGCCTGCCC TGGAGCTGAC GGCGCTGACG GCTCCTGGGC CGGGATCCGG CTCCCGGCCC 960

ACCCAGGCCA AGCTGCTGGC TAAGAAGCGC GTGGTGCGAA TGTTGCTGGT GATCGTTGTG 1020

CTTTTTTTTC TGTGTTGGTT GCCAGTTTAT AGTGCCAACA CGTGGCGCGC CTTTGATGGC 1080

CCGGGTGCAC ACCGAGCACT CTCGGGTGCT CCTATCTCCT TCATTCACTT GCTGAGCTAC 1140

GCCTCGGCCT GTGTCAACCC CCTGGTCTAC TGCTTCATGC ACCGTCGCTT TCGCCAGGCC 1200

TGCCTGGAAA CTTGCGCTCG CTGCTGCCCC CGGCCTCCAC GAGCTCGCCC CAGGGCTCTT 1260

CCCGATGAGG ACCCTCCCAC TCCCTCCATT GCTTCGCTGT CCAGGCTTAG CTACACCACC 1320

ATCAGCACAC TGGGCCCTGG CTGA 1344

(79) INFORMATION FOR SEQ ID NO: 78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 447 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

Met Glu Leu Leu Lys Leu Asn Arg Ser Val Gln Gly Thr Gly Pro Gly
1 5 10 15

Pro Gly Ala Ser Leu Cys Arg Pro Gly Ala Pro Leu Leu Asn Ser Ser
20 25 30

Ser Val Gly Asn Leu Ser Cys Glu Pro Pro Arg Ile Arg Gly Ala Gly
35 40 45

Thr Arg Glu Leu Glu Leu Ala Ile Arg Ile Thr Leu Tyr Ala Val Ile
50 55 60

Phe Leu Met Ser Val Gly Gly Asn Met Leu Ile Ile Val Val Leu Gly
65 70 75 80

Leu Ser Arg Arg Leu Arg Thr Val Thr Asn Ala Phe Leu Leu Ser Leu
85 90 95

Ala Val Ser Asp Leu Leu Leu Ala Val Ala Cys Met Pro Phe Thr Leu
100 105 110

Leu Pro Asn Leu Met Gly Thr Phe Ile Phe Gly Thr Val Ile Cys Lys
115 120 125

Ala Val Ser Tyr Leu Met Gly Val Ser Val Ser Val Ser Thr Leu Ser
130 135 140

Leu Val Ala Ile Ala Leu Glu Arg Tyr Ser Ala Ile Cys Arg Pro Leu
145 150 155 160

Gln Ala Arg Val Trp Gln Thr Arg Ser His Ala Ala Arg Val Ile Val
165 170 175

Ala Thr Trp Leu Leu Ser Gly Leu Leu Met Val Pro Tyr Pro Val Tyr
180 185 190

Thr Val Val Gln Pro Val Gly Pro Arg Val Leu Gln Cys Val His Arg
195 200 205

Trp Pro Ser Ala Arg Val Arg Gln Thr Trp Ser Val Leu Leu Leu Leu
210 215 220

Leu Leu Phe Phe Ile Pro Gly Val Val Met Ala Val Ala Tyr Gly Leu
225 230 235 240

Ile Ser Arg Glu Leu Tyr Leu Gly Leu Arg Phe Asp Gly Asp Ser Asp
245 250 255

Ser Asp Ser Gln Ser Arg Val Arg Asn Gln Gly Gly Leu Pro Gly Ala
260 265 270

Val His Gln Asn Gly Arg Cys Arg Pro Glu Thr Gly Ala Val Gly Lys
275 280 285

Asp Ser Asp Gly Cys Tyr Val Gln Leu Pro Arg Ser Arg Pro Ala Leu
290 295 300

Glu Leu Thr Ala Leu Thr Ala Pro Gly Pro Gly Ser Gly Ser Arg Pro
305 310 315 320

Thr Gln Ala Lys Leu Leu Ala Lys Lys Arg Val Val Arg Met Leu Leu
325 330 335

Val Ile Val Val Leu Phe Phe Leu Cys Trp Leu Pro Val Tyr Ser Ala
340 345 350

Asn Thr Trp Arg Ala Phe Asp Gly Pro Gly Ala His Arg Ala Leu Ser
355 360 365

Val Ala Pro Ile Ser Phe Ile His Leu Leu Ser Tyr Ala Ser Ala Cys
370 375 380

Val Asn Pro Leu Val Tyr Cys Phe Met His Arg Arg Phe Arg Gln Ala
385 390 395 400

Cys Leu Glu Thr Cys Ala Arg Cys Cys Pro Arg Pro Pro Arg Ala Arg
405 410 415

Pro Arg Ala Leu Pro Asp Glu Asp Pro Pro Thr Pro Ser Ile Ala Ser
420 425 430

Leu Ser Arg Leu Ser Tyr Thr Thr Ile Ser Thr Leu Gly Pro Gly
435 440 445

(80) INFORMATION FOR SEQ ID NO: 79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

TGCAAGCTTA AAAAGGAAAA AATGAACAGC 30

(81) INFORMATION FOR SEQ ID NO: 80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

TAAGGATCCC TTCCCTTCAA AACATCCTTG 30

(82) INFORMATION FOR SEQ ID NO: 81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1014 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

ATGAACAGCA CATGTATTGA AGAACAGCAT GACCTGGATC ACTATTTGTT TCCCATTGTT 60

TACATCTTTG TGATTATAGT CAGCATTCCA GCCAATATTG GATCTCTGTG TGTGTCTTTC 120

CTGCAACCCA AGAAGGAAAG TGAACTAGGA ATTTACCTCT TCAGTTTGTC ACTATCAGAT 180

TTACTCTATG CATTAACTCT CCCTTTATGG ATTGATTATA CTTGGAATAA AGACAACTGG 240

ACTTTCTCTC CTGCCTTGTG CAAAGGGAGT GCTTTTCTCA TGTACATGAA GTTTTACAGC 300

AGCACAGCAT TCCTCACCTG CATTGCCGTT GATCGGTATT TGGCTGTTGT CTACCCTTTG 360

AAGTTTTTTT TCCTAAGGAC AAGAAGAATT GCACTCATGG TCAGCCTGTC CATCTGGATA 420 

TTGGAAACCA TCTTCAATGC TGTCATGTTG TGGGAAGATG AAACAGTTGT TGAATATTGC 480 

GATGGCGAAA AGTCTAATTT TACTTTATGC TATGACAAAT ACCCTTTAGA GAAATGGCAA 540 

ATCAACCTCA ACTTGTTCAG GACGTGTACA GGCTATGCAA TACCTTTGGT CACCATCCTG 600 

ATCTGTAACC GGAAAGTCTA CCAAGCTGTG CGGCACAATA AAGCCACGGA AAACAAGGAA 660 

AAGAAGAGAA TCATAAAACT AGTTGTCAGC ATCACAGTTA CTTTTGTCTT ATGCTTTACT 720 

CCCTTTCATG TGATGTTGCT GATTCGCTGC ATTTTAGAGC ATGCTGTGAA CTTCGAAGAC 780 

CACAGCAATT CTGGGAAGCG AACTTACACA ATGTATAGAA TCACGGTTGC ATTAACAAGT 840 

TTAAATTGTG TTGCTGATCC AATTCTGTAC TGTTTTGTTA CCGAAACAGG AAGATATGAT 900 

ATGTGGAATA TATTAAAATT CTGCACTGGG AGGTGTAATA CATCACAAAG ACAAAGAAAA 960 

CGCATACTTT CTGTGTCTAC AAAAGATACT ATGGAATTAG AGGTCCTTGA GTAG 1014 
(83) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 337 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: not relevant 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Met Asn Ser Thr Cys Ile Glu Glu Gln His Asp Leu Asp His Tyr Leu
1 5 10 15

Phe Pro Ile Val Tyr Ile Phe Val Ile Ile Val Ser Ile Pro Ala Asn
20 25 30

Ile Gly Ser Leu Cys Val Ser Phe Leu Gln Pro Lys Lys Glu Ser Glu
35 40 45

Leu Gly Ile Tyr Leu Phe Ser Leu Ser Leu Ser Asp Leu Leu Tyr Ala
50 55 60

Leu Thr Leu Pro Leu Trp Ile Asp Tyr Thr Trp Asn Lys Asp Asn Trp
65 70 75 80

Thr Phe Ser Pro Ala Leu Cys Lys Gly Ser Ala Phe Leu Met Tyr Met
85 90 95

Lys Phe Tyr Ser Ser Thr Ala Phe Leu Thr Cys Ile Ala Val Asp Arg
100 105 110

Tyr Leu Ala Val Val Tyr Pro Leu Lys Phe Phe Phe Leu Arg Thr Arg
115 120 125

Arg Ile Ala Leu Met Val Ser Leu Ser Ile Trp Ile Leu Glu Thr Ile
130 135 140

Phe Asn Ala Val Met Leu Trp Glu Asp Glu Thr Val Val Glu Tyr Cys
145 150 155 160

Asp Ala Glu Lys Ser Asn Phe Thr Leu Cys Tyr Asp Lys Tyr Pro Leu
165 170 175

Glu Lys Trp Gln Ile Asn Leu Asn Leu Phe Arg Thr Cys Thr Gly Tyr
180 185 190

Ala Ile Pro Leu Val Thr Ile Leu Ile Cys Asn Arg Lys Val Tyr Gln
195 200 205

Ala Val Arg His Asn Lys Ala Thr Glu Asn Lys Glu Lys Lys Arg Ile
210 215 220

Ile Lys Leu Leu Val Ser Ile Thr Val Thr Phe Val Leu Cys Phe Thr
225 230 235 240

Pro Phe His Val Met Leu Leu Ile Arg Cys Ile Leu Glu His Ala Val
245 250 255

Asn Phe Glu Asp His Ser Asn Ser Gly Lys Arg Thr Tyr Thr Met Tyr
260 265 270

Arg Ile Thr Val Ala Leu Thr Ser Leu Asn Cys Val Ala Asp Pro Ile
275 280 285

Leu Tyr Cys Phe Val Thr Glu Thr Gly Arg Tyr Asp Met Trp Asn Ile
290 295 300

Leu Lys Phe Cys Thr Gly Arg Cys Asn Thr Ser Gln Arg Gln Arg Lys
305 310 315 320

Arg Ile Leu Ser Val Ser Thr Lys Asp Thr Met Glu Leu Glu Val Leu
325 330 335

Glu
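
The relationship between a coding sequence and its listed translation can be spot-checked mechanically. The following is a minimal illustrative sketch, not part of the original listing: it assumes the standard genetic code, and the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are hypothetical helpers introduced here. It translates the first 60 nucleotides of SEQ ID NO: 81 so the result can be compared with the first 20 residues of SEQ ID NO: 82.

```python
# Standard genetic code, built in TCAG order (assumption: no alternative code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes, matching the notation used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> list[str]:
    """Translate a DNA string (spaces allowed) in frame 1 until a stop codon."""
    seq = "".join(dna.split()).upper()
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the open reading frame
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First row (nucleotides 1-60) of SEQ ID NO: 81 as printed in the listing.
seq81_start = (
    "ATGAACAGCA CATGTATTGA AGAACAGCAT GACCTGGATC ACTATTTGTT TCCCATTGTT"
)
print(" ".join(translate(seq81_start)))
```

Running the same helper over a full coding sequence and comparing the residue count with the stated LENGTH is one way to catch transcription errors in either record.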



(84) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:

CAGGAAGAAG AAACGAGCTG TCATTATGAT GGTGACAGTG 40

(85) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

CACTGTCACC ATCATAATGA CAGCTCGTTT CTTCTTCCTG 40
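
As printed, SEQ ID NO: 83 and SEQ ID NO: 84 read as reverse complements of one another, which is the usual arrangement for a complementary oligonucleotide pair. The sketch below is illustrative only and not part of the original listing; `reverse_complement` is a hypothetical helper that assumes the sequences contain only the unambiguous bases A, C, G, and T.

```python
# Complement mapping for unambiguous DNA bases only (assumption: no IUPAC codes).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    cleaned = "".join(seq.split()).upper()
    return cleaned.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing entries for SEQ ID NO: 83 and SEQ ID NO: 84.
seq_83 = "CAGGAAGAAG AAACGAGCTG TCATTATGAT GGTGACAGTG"
seq_84 = "CACTGTCACC ATCATAATGA CAGCTCGTTT CTTCTTCCTG"

# True when the two printed oligonucleotides are exact reverse complements.
print(reverse_complement(seq_83) == "".join(seq_84.split()))
```

The same check can be applied to any other sense and anti-sense pair in the listing to confirm that both strands were transcribed consistently.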

(86) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

GGCCACCGGC AGACCAAACG CGTCCTGCTG 30

(87) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

CTCCTTCGGT CCTCCTATCG TTGTCAGAAG T 31

(88) INFORMATION FOR SEQ ID NO: 87:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

GGAAAAGAAG AGAATCAAAA AACTACTTGT CAGCATC 37

(89) INFORMATION FOR SEQ ID NO: 88:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

CTCCTTCGGT CCTCCTATCG TTGTCAGAAG T 31

(90) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1080 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

ATGATTCTCA ACTCTTCTAC TGAAGATGGT ATTAAAAGAA TCCAAGATGA TTGTCCCAAA 60 

GCTGGAAGGC ATAATTACAT ATTTGTCATG ATTCCTACTT TATACAGTAT CATCTTTGTG 120

GTGGGAATAT TTGGAAACAG CTTGGTGGTG ATAGTCATTT ACTTTTATAT GAAGCTGAAG 180

ACTGTGGCCA GTGTTTTTCT TTTGAATTTA GCACTGGCTG ACTTATGCTT TTTACTGACT 240

TTGCCACTAT GGGCTGTCTA CACAGCTATG GAATACCGCT GGCCCTTTGG CAATTACCTA 300

TGTAAGATTG CTTCAGCCAG CGTGAGTTTC AACCTGTACG CTAGTGTGTT TCTACTCACG 360

TGTCTCAGCA TTGATCGATA CCTGGCTATT GTTCACCCAA TGAAGTCCCG CCTTCGACGC 420

ACAATGCTTG TAGCCAAAGT CACCTGCATC ATCATTTGGC TGCTGGCAGG CTTGGCCAGT 480

TTGCCAGCTA TAATCCATCG AAATGTATTT TTCATTGAGA ACACCAATAT TACAGTTTGT 540

GCTTTCCATT ATGAGTCCCA AAATTCAACC CTTCCGATAG GGCTGGGCCT GACCAAAAAT 600

ATACTGGGTT TCCTGTTTCC TTTTCTGATC ATTCTTACAA GTTATACTCT TATTTGGAAG 660

GCCCTAAAGA AGGCTTATGA AATTCAGAAG AACAAACCAA GAAATGATGA TATTAAAAAG 720

ATAATTATGG CAATTGTGCT TTTCTTTTTC TTTTCCTGGA TTCCCCACCA AATATTCACT 780

TTTCTGGATG TATTGATTCA ACTAGGCATC ATACGTGACT GTAGAATTGC AGATATTGTG 840

GACACGGCCA TGCCTATCAC CATTTGTATA GCTTATTTTA ACAATTGCCT GAATCCTCTT 900

TTTTATGGCT TTCTGGGGAA AAAATTTAAA AGATATTTTC TCCAGCTTCT AAAATATATT 960

CCCCCAAAAG CCAAATCCCA CTCAAACCTT TCAACAAAAA TGAGCACGCT TTCCTACCGC 1020 

CCCTCAGATA ATGTAAGCTC ATCCACCAAG AAGCCTGCAC CATGTTTTGA GGTTGAGTGA 1080 

(91) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 359 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Met Ile Leu Asn Ser Ser Thr Glu Asp Gly Ile Lys Arg Ile Gln Asp
1 5 10 15

Asp Cys Pro Lys Ala Gly Arg His Asn Tyr Ile Phe Val Met Ile Pro
20 25 30

Thr Leu Tyr Ser Ile Ile Phe Val Val Gly Ile Phe Gly Asn Ser Leu
35 40 45

Val Val Ile Val Ile Tyr Phe Tyr Met Lys Leu Lys Thr Val Ala Ser
50 55 60

Val Phe Leu Leu Asn Leu Ala Leu Ala Asp Leu Cys Phe Leu Leu Thr
65 70 75 80

Leu Pro Leu Trp Ala Val Tyr Thr Ala Met Glu Tyr Arg Trp Pro Phe
85 90 95

Gly Asn Tyr Leu Cys Lys Ile Ala Ser Ala Ser Val Ser Phe Asn Leu
100 105 110

Tyr Ala Ser Val Phe Leu Leu Thr Cys Leu Ser Ile Asp Arg Tyr Leu
115 120 125

Ala Ile Val His Pro Met Lys Ser Arg Leu Arg Arg Thr Met Leu Val
130 135 140

Ala Lys Val Thr Cys Ile Ile Ile Trp Leu Leu Ala Gly Leu Ala Ser
145 150 155 160

Leu Pro Ala Ile Ile His Arg Asn Val Phe Phe Ile Glu Asn Thr Asn
165 170 175

Ile Thr Val Cys Ala Phe His Tyr Glu Ser Gln Asn Ser Thr Leu Pro
180 185 190

Ile Gly Leu Gly Leu Thr Lys Asn Ile Leu Gly Phe Leu Phe Pro Phe
195 200 205

Leu Ile Ile Leu Thr Ser Tyr Thr Leu Ile Trp Lys Ala Leu Lys Lys
210 215 220

Ala Tyr Glu Ile Gln Lys Asn Lys Pro Arg Asn Asp Asp Ile Lys Lys
225 230 235 240

Ile Ile Met Ala Ile Val Leu Phe Phe Phe Phe Ser Trp Ile Pro His
245 250 255

Gln Ile Phe Thr Phe Leu Asp Val Leu Ile Gln Leu Gly Ile Ile Arg
260 265 270

Asp Cys Arg Ile Ala Asp Ile Val Asp Thr Ala Met Pro Ile Thr Ile
275 280 285

Cys Ile Ala Tyr Phe Asn Asn Cys Leu Asn Pro Leu Phe Tyr Gly Phe
290 295 300

Leu Gly Lys Lys Phe Lys Arg Tyr Phe Leu Gln Leu Leu Lys Tyr Ile
305 310 315 320

Pro Pro Lys Ala Lys Ser His Ser Asn Leu Ser Thr Lys Met Ser Thr
325 330 335

Leu Ser Tyr Arg Pro Ser Asp Asn Val Ser Ser Ser Thr Lys Lys Pro
340 345 350

Ala Pro Cys Phe Glu Val Glu
355

(92) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

CCAAGAAATG ATGATATTAA AAAGATAATT ATGGC 35

(93) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

CTCCTTCGGT CCTCCTATCG TTGTCAGAAG T 31

(94) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1080 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

ATGATTCTCA ACTCTTCTAC TGAAGATGGT ATTAAAAGAA TCCAAGATGA TTGTCCCAAA 60 

GCTGGAAGGC ATAATTACAT ATTTGTCATG ATTCCTACTT TATACAGTAT CATCTTTGTG 120 

GTGGGAATAT TTGGAAACAG CTTGGTGGTG ATAGTCATTT ACTTTTATAT GAAGCTGAAG 180 

ACTGTGGCCA GTGTTTTTCT TTTGAATTTA GCACTGGCTG ACTTATGCTT TTTACTGACT 240 

TTGCCACTAT GGGCTGTCTA CACAGCTATG GAATACCGCT GGCCCTTTGG CAATTACCTA 300

TGTAAGATTG CTTCAGCCAG CGTCAGTTTC GCCCTGTACG CTAGTGTGTT TCTACTCACG 360

TGTCTCAGCA TTGATCGATA CCTGGCTATT GTTCACCCAA TGAAGTCCCG CCTTCGACGC 420

ACAATGCTTG TAGCCAAAGT CACCTGCATC ATCATTTGGC TGCTGGCAGG CTTGGCCAGT 480

TTGCCAGCTA TAATCCATCG AAATGTATTT TTCATTGAGA ACACCAATAT TACAGTTTGT 540

GCTTTCCATT ATGAGTCCCA AAATTCAACC CTTCCGATAG GGCTGGGCCT GACCAAAAAT 600

ATACTGGGTT TCCTGTTTCC TTTTCTGATC ATTCTTACAA GTTATACTCT TATTTGGAAG 660

GCCCTAAAGA AGGCTTATGA AATTCAGAAG AACAAACCAA GAAATGATGA TATTTTTAAG 720

ATAATTATGG CAATTGTGCT TTTCTTTTTC TTTTCCTGGA TTCCCCACCA AATATTCACT 780

TTTCTGGATG TATTGATTCA ACTAGGCATC ATACGTGACT GTAGAATTGC AGATATTGTG 840

GACACGGCCA TGCCTATCAC CATTTGTATA GCTTATTTTA ACAATTGCCT GAATCCTCTT 900

TTTTATGGCT TTCTGGGGAA AAAATTTAAA AGATATTTTC TCCAGCTTCT AAAATATATT 960

CCCCCAAAAG CCAAATCCCA CTCAAACCTT TCAACAAAAA TGAGCACGCT TTCCTACCGC 1020

CCCTCAGATA ATGTAAGCTC ATCCACCAAG AAGCCTGCAC CATGTTTTGA GGTTGAGTGA 1080

(95) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 359 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:



Met Ile Leu Asn Ser Ser Thr Glu Asp Gly Ile Lys Arg Ile Gln Asp
1 5 10 15

Asp Cys Pro Lys Ala Gly Arg His Asn Tyr Ile Phe Val Met Ile Pro
20 25 30

Thr Leu Tyr Ser Ile Ile Phe Val Val Gly Ile Phe Gly Asn Ser Leu
35 40 45

Val Val Ile Val Ile Tyr Phe Tyr Met Lys Leu Lys Thr Val Ala Ser
50 55 60

Val Phe Leu Leu Asn Leu Ala Leu Ala Asp Leu Cys Phe Leu Leu Thr
65 70 75 80

Leu Pro Leu Trp Ala Val Tyr Thr Ala Met Glu Tyr Arg Trp Pro Phe
85 90 95

Gly Asn Tyr Leu Cys Lys Ile Ala Ser Ala Ser Val Ser Phe Ala Leu
100 105 110

Tyr Ala Ser Val Phe Leu Leu Thr Cys Leu Ser Ile Asp Arg Tyr Leu
115 120 125

Ala Ile Val His Pro Met Lys Ser Arg Leu Arg Arg Thr Met Leu Val
130 135 140

Ala Lys Val Thr Cys Ile Ile Ile Trp Leu Leu Ala Gly Leu Ala Ser
145 150 155 160

Leu Pro Ala Ile Ile His Arg Asn Val Phe Phe Ile Glu Asn Thr Asn
165 170 175

Ile Thr Val Cys Ala Phe His Tyr Glu Ser Gln Asn Ser Thr Leu Pro
180 185 190

Ile Gly Leu Gly Leu Thr Lys Asn Ile Leu Gly Phe Leu Phe Pro Phe
195 200 205

Leu Ile Ile Leu Thr Ser Tyr Thr Leu Ile Trp Lys Ala Leu Lys Lys
210 215 220

Ala Tyr Glu Ile Gln Lys Asn Lys Pro Arg Asn Asp Asp Ile Phe Lys
225 230 235 240

Ile Ile Met Ala Ile Val Leu Phe Phe Phe Phe Ser Trp Ile Pro His
245 250 255

Gln Ile Phe Thr Phe Leu Asp Val Leu Ile Gln Leu Gly Ile Ile Arg
260 265 270

Asp Cys Arg Ile Ala Asp Ile Val Asp Thr Ala Met Pro Ile Thr Ile
275 280 285

Cys Ile Ala Tyr Phe Asn Asn Cys Leu Asn Pro Leu Phe Tyr Gly Phe
290 295 300

Leu Gly Lys Lys Phe Lys Arg Tyr Phe Leu Gln Leu Leu Lys Tyr Ile
305 310 315 320

Pro Pro Lys Ala Lys Ser His Ser Asn Leu Ser Thr Lys Met Ser Thr
325 330 335

Leu Ser Tyr Arg Pro Ser Asp Asn Val Ser Ser Ser Thr Lys Lys Pro
340 345 350

Ala Pro Cys Phe Glu Val Glu
355
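
Because SEQ ID NO: 89 and SEQ ID NO: 93 have the same length, the codons at which they differ can be located by a direct in-frame comparison. The sketch below is illustrative only and not part of the original listing; `codon_differences` is a hypothetical helper, and the two inputs are the rows covering nucleotides 301-360 (codons 101-120) exactly as printed in this listing, so any transcription error in those rows would carry through to the result.

```python
def codon_differences(ref: str, alt: str, first_codon: int = 1):
    """List (codon_number, ref_codon, alt_codon) for every in-frame mismatch."""
    ref = "".join(ref.split()).upper()
    alt = "".join(alt.split()).upper()
    if len(ref) != len(alt) or len(ref) % 3 != 0:
        raise ValueError("inputs must be equal-length, in-frame sequences")
    diffs = []
    for i in range(0, len(ref), 3):
        a, b = ref[i:i + 3], alt[i:i + 3]
        if a != b:
            diffs.append((first_codon + i // 3, a, b))
    return diffs


# Rows for nucleotides 301-360 as printed for SEQ ID NO: 89 and SEQ ID NO: 93.
seq89_row = "TGTAAGATTG CTTCAGCCAG CGTGAGTTTC AACCTGTACG CTAGTGTGTT TCTACTCACG"
seq93_row = "TGTAAGATTG CTTCAGCCAG CGTCAGTTTC GCCCTGTACG CTAGTGTGTT TCTACTCACG"

# Nucleotide 301 opens codon 101, so numbering starts there.
for number, ref_codon, alt_codon in codon_differences(seq89_row, seq93_row, 101):
    print(number, ref_codon, alt_codon)
```

In these rows the comparison reports codon 108 (GTG versus GTC, both valine) and codon 111 (AAC versus GCC), the latter matching the Asn to Ala difference at position 111 between SEQ ID NO: 90 and SEQ ID NO: 94.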

(96) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

CCCAAGCTTC CCCAGGTGTA TTTGAT 26

(97) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

CCTGCAGGCG AAACTGACTC TGGCTGAAG 29

(98) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 42 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

CTGTACGCTA GTGTGTTTCT ACTCACGTGT CTCAGCATTG AT 42

(99) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

GTTGGATCCA CATAATGCAT TTTCTC 26

(100) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1080 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

ATGATTCTCA ACTCTTCTAC TGAAGATGGT ATTAAAAGAA TCCAAGATGA TTGTCCCAAA 60 

GCTGGAAGGC ATAATTACAT ATTTGTCATG ATTCCTACTT TATACAGTAT CATCTTTGTG 120 

GTGGGAATAT TTGGAAACAG CTTGGTGGTG ATAGTCATTT ACTTTTATAT GAAGCTGAAG 180 

ACTGTGGCCA GTGTTTTTCT TTTGAATTTA GCACTGGCTG ACTTATGCTT TTTACTGACT 240

TTGCCACTAT GGGCTGTCTA CACAGCTATG GAATACCGCT GGCCCTTTGG CAATTACCTA 300

TGTAAGATTG CTTCAGCCAG CGTCAGTTTC AACCTGTACG CTAGTGTGTT TCTACTCACG 360

TGTCTCAGCA TTGATCGATA CCTGGCTATT GTTCACCCAA TGAAGTCCCG CCTTCGACGC 420

ACAATGCTTG TAGCCAAAGT CACCTGCATC ATCATTTGGC TGCTGGCAGG CTTGGCCAGT 480

TTGCCAGCTA TAATCCATCG AAATGTATTT TTCATTGAGA ACACCAATAT TACAGTTTGT 540

GCTTTCCATT ATGAGTCCCA AAATTCAACC CTTCCGATAG GGCTGGGCCT GACCAAAAAT 600

ATACTGGGTT TCCTGTTTCC TTTTCTGATC ATTCTTACAA GTTATTTTGG AATTCGAAAA 660

CACTTACTGA AGACGAATAG CTATGGGAAG AACAGGATAA CCCGTGACCA AGTTAAGAAG 720

ATAATTATGG CAATTGTGCT TTTCTTTTTC TTTTCCTGGA TTCCCCACCA AATATTCACT 780

TTTCTGGATG TATTGATTCA ACTAGGCATC ATACGTGACT GTAGAATTGC AGATATTGTG 840

GACACGGCCA TGCCTATCAC CATTTGTATA GCTTATTTTA ACAATTGCCT GAATCCTCTT 900

TTTTATGGCT TTCTGGGGAA AAAATTTAAA AGATATTTTC TCCAGCTTCT AAAATATATT 960

CCCCCAAAAG CCAAATCCCA CTCAAACCTT TCAACAAAAA TGAGCACGCT TTCCTACCGC 1020

CCCTCAGATA ATGTAAGCTC ATCCACCAAG AAGCCTGCAC CATGTTTTGA GGTTGAGTGA 1080

(101) INFORMATION FOR SEQ ID NO: 100:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 359 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Met Ile Leu Asn Ser Ser Thr Glu Asp Gly Ile Lys Arg Ile Gln Asp
1 5 10 15

Asp Cys Pro Lys Ala Gly Arg His Asn Tyr Ile Phe Val Met Ile Pro
20 25 30

Thr Leu Tyr Ser Ile Ile Phe Val Val Gly Ile Phe Gly Asn Ser Leu
35 40 45

Val Val Ile Val Ile Tyr Phe Tyr Met Lys Leu Lys Thr Val Ala Ser
50 55 60

Val Phe Leu Leu Asn Leu Ala Leu Ala Asp Leu Cys Phe Leu Leu Thr
65 70 75 80

Leu Pro Leu Trp Ala Val Tyr Thr Ala Met Glu Tyr Arg Trp Pro Phe
85 90 95

Gly Asn Tyr Leu Cys Lys Ile Ala Ser Ala Ser Val Ser Phe Asn Leu
100 105 110

Tyr Ala Ser Val Phe Leu Leu Thr Cys Leu Ser Ile Asp Arg Tyr Leu
115 120 125

Ala Ile Val His Pro Met Lys Ser Arg Leu Arg Arg Thr Met Leu Val
130 135 140

Ala Lys Val Thr Cys Ile Ile Ile Trp Leu Leu Ala Gly Leu Ala Ser
145 150 155 160

Leu Pro Ala Ile Ile His Arg Asn Val Phe Phe Ile Glu Asn Thr Asn
165 170 175

Ile Thr Val Cys Ala Phe His Tyr Glu Ser Gln Asn Ser Thr Leu Pro
180 185 190

Ile Gly Leu Gly Leu Thr Lys Asn Ile Leu Gly Phe Leu Phe Pro Phe
195 200 205

Leu Ile Ile Leu Thr Ser Tyr Phe Gly Ile Arg Lys His Leu Leu Lys
210 215 220

Thr Asn Ser Tyr Gly Lys Asn Arg Ile Thr Arg Asp Gln Val Lys Lys
225 230 235 240

Ile Ile Met Ala Ile Val Leu Phe Phe Phe Phe Ser Trp Ile Pro His
245 250 255

Gln Ile Phe Thr Phe Leu Asp Val Leu Ile Gln Leu Gly Ile Ile Arg
260 265 270

Asp Cys Arg Ile Ala Asp Ile Val Asp Thr Ala Met Pro Ile Thr Ile
275 280 285

Cys Ile Ala Tyr Phe Asn Asn Cys Leu Asn Pro Leu Phe Tyr Gly Phe
290 295 300

Leu Gly Lys Lys Phe Lys Arg Tyr Phe Leu Gln Leu Leu Lys Tyr Ile
305 310 315 320

Pro Pro Lys Ala Lys Ser His Ser Asn Leu Ser Thr Lys Met Ser Thr
325 330 335

Leu Ser Tyr Arg Pro Ser Asp Asn Val Ser Ser Ser Thr Lys Lys Pro
340 345 350

Ala Pro Cys Phe Glu Val Glu
355

(102) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:

TCCGAATTCC AAAATAACTT GTAAGAATGA TCAGAAA 37

(103) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

AGATCTTAAG AAGATAATTA TGGCAATTGT GCT 33

(104) INFORMATION FOR SEQ ID NO: 103:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103:

AATTCGAAAA CACTTACTGA AGACGAATAG CTATGGGAAG AACAGGATAA CCCGTGACCA 60

AG 62

(105) INFORMATION FOR SEQ ID NO: 104:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 62 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104:

TTAACTTGGT CACGGGTTAT CCTGTTCTTC CCATAGCTAT TCGTCTTCAG TAAGTGTTTT 60

CG 62

(106) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1083 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

ATGATTCTCA ACTCTTCTAC TGAAGATGGT ATTAAAAGAA TCCAAGATGA TTGTCCCAAA 60

GCTGGAAGGC ATAATTACAT ATTTGTCATG ATTCCTACTT TATACAGTAT CATCTTTGTG 120

GTGGGAATAT TTGGAAACAG CTTGGTGGTG ATAGTCATTT ACTTTTATAT GAAGCTGAAG 180

ACTGTGGCCA GTGTTTTTCT TTTGAATTTA GCACTGGCTG ACTTATGCTT TTTACTGACT 240

TTGCCACTAT GGGCTGTCTA CACAGCTATG GAATACCGCT GGCCCTTTGG CAATTACCTA 300

TGTAAGATTG CTTCAGCCAG CGTCAGTTTC AACCTGTACG CTAGTGTGTT TCTACTCACG 360

TGTCTCAGCA TTGATCGATA CCTGGCTATT GTTCACCCAA TGAAGTCCCG CCTTCGACGC 420

ACAATGCTTG TAGCCAAAGT CACCTGCATC ATCATTTGGC TGCTGGCAGG CTTGGCCAGT 480

TTGCCAGCTA TAATCCATCG AAATGTATTT TTCATTGAGA ACACCAATAT TACAGTTTGT 540

GCTTTCCATT ATGAGTCCCA AAATTCAACC CTTCCGATAG GGCTGGGCCT GACCAAAAAT 600

ATACTGGGTT TCCTGTTTCC TTTTCTGATC ATTCTTACAA GTTATACTCT TATTTGGAAG 660

GCCCTAAAGA AGGCTTATGA AATTCAGAAG AACAAACCAA GAAATGATGA TATTTTTAAG 720

ATAATTATGG CAGCAATTGT GCTTTTCTTT TTCTTTTCCT GGATTCCCCA CCAAATATTC 780

ACTTTTCTGG ATGTATTGAT TCAACTAGGC ATCATACGTG ACTGTAGAAT TGCAGATATT 840

GTGGACACGG CCATGCCTAT CACCATTTGT ATAGCTTATT TTAACAATTG CCTGAATCCT 900

CTTTTTTATG GCTTTCTGGG GAAAAAATTT AAAAGATATT TTCTCCAGCT TCTAAAATAT 960

ATTCCCCCAA AAGCCAAATC CCACTCAAAC CTTTCAACAA AAATGAGCAC GCTTTCCTAC 1020

CGCCCCTCAG ATAATGTAAG CTCATCCACC AAGAAGCCTG CACCATGTTT TGAGGTTGAG 1080

TGA 1083

(107) INFORMATION FOR SEQ ID NO: 106:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 360 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106:

Met Ile Leu Asn Ser Ser Thr Glu Asp Gly Ile Lys Arg Ile Gln Asp
1 5 10 15

Asp Cys Pro Lys Ala Gly Arg His Asn Tyr Ile Phe Val Met Ile Pro
20 25 30

Thr Leu Tyr Ser Ile Ile Phe Val Val Gly Ile Phe Gly Asn Ser Leu
35 40 45

Val Val Ile Val Ile Tyr Phe Tyr Met Lys Leu Lys Thr Val Ala Ser
50 55 60

Val Phe Leu Leu Asn Leu Ala Leu Ala Asp Leu Cys Phe Leu Leu Thr
65 70 75 80

Leu Pro Leu Trp Ala Val Tyr Thr Ala Met Glu Tyr Arg Trp Pro Phe
85 90 95

Gly Asn Tyr Leu Cys Lys Ile Ala Ser Ala Ser Val Ser Phe Asn Leu
100 105 110

Tyr Ala Ser Val Phe Leu Leu Thr Cys Leu Ser Ile Asp Arg Tyr Leu
115 120 125

Ala Ile Val His Pro Met Lys Ser Arg Leu Arg Arg Thr Met Leu Val
130 135 140

Ala Lys Val Thr Cys Ile Ile Ile Trp Leu Leu Ala Gly Leu Ala Ser
145 150 155 160

Leu Pro Ala Ile Ile His Arg Asn Val Phe Phe Ile Glu Asn Thr Asn
165 170 175

Ile Thr Val Cys Ala Phe His Tyr Glu Ser Gln Asn Ser Thr Leu Pro
180 185 190

Ile Gly Leu Gly Leu Thr Lys Asn Ile Leu Gly Phe Leu Phe Pro Phe
195 200 205

Leu Ile Ile Leu Thr Ser Tyr Thr Leu Ile Trp Lys Ala Leu Lys Lys
210 215 220

Ala Tyr Glu Ile Gln Lys Asn Lys Pro Arg Asn Asp Asp Ile Phe Lys
225 230 235 240

Ile Ile Met Ala Ala Ile Val Leu Phe Phe Phe Phe Ser Trp Ile Pro
245 250 255

His Gln Ile Phe Thr Phe Leu Asp Val Leu Ile Gln Leu Gly Ile Ile
260 265 270

Arg Asp Cys Arg Ile Ala Asp Ile Val Asp Thr Ala Met Pro Ile Thr
275 280 285

Ile Cys Ile Ala Tyr Phe Asn Asn Cys Leu Asn Pro Leu Phe Tyr Gly
290 295 300

Phe Leu Gly Lys Lys Phe Lys Arg Tyr Phe Leu Gln Leu Leu Lys Tyr
305 310 315 320

Ile Pro Pro Lys Ala Lys Ser His Ser Asn Leu Ser Thr Lys Met Ser
325 330 335

Thr Leu Ser Tyr Arg Pro Ser Asp Asn Val Ser Ser Ser Thr Lys Lys
340 345 350

Pro Ala Pro Cys Phe Glu Val Glu
355 360

(108) INFORMATION FOR SEQ ID NO:107:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:107:
CCCAAGCTTC CCCAGGTGTA TTTGAT 26

(109) INFORMATION FOR SEQ ID NO:108:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:108:
AAGCACAATT GCTGCATAAT TATCTTAAAA ATATCATC 38

(110) INFORMATION FOR SEQ ID NO:109:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:109:
AAGATAATTA TGGCAGCAAT TGTGCTTTTC TTTTTCTTT 39

(111) INFORMATION FOR SEQ ID NO:110:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:110:
GTTGGATCCA CATAATGCAT TTTCTC 26

(112) INFORMATION FOR SEQ ID NO:111:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1344 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:111:
ATGGAGCTGC TAAAGCTGAA CCGGAGCGTG CAGGGAACCG GACCCGGGCC GGGGGCTTCC 60
CTGTGCCGCC CGGGGGCGCC TCTCCTCAAC AGCAGCAGTG TGGGCAACCT CAGCTGCGAG 120
CCCCCTCGCA TTCGCGGAGC CGGGACACGA GAATTGGAGC TGGCCATTAG AATCACTCTT 180
TACGCAGTGA TCTTCCTGAT GAGCGTTGGA GGAAATATGC TCATCATCGT GGTCCTGGGA 240
CTGAGCCGCC GCCTGAGGAC TGTCACCAAT GCCTTCCTCC TCTCACTGGC AGTCAGCGAC 300
CTCCTGCTGG CTGTGGCTTG CATGCCCTTC ACCCTCCTGC CCAATCTCAT GGGCACATTC 360
ATCTTTGGCA CCGTCATCTG CAAGGCGGTT TCCTACCTCA TGGGGGTGTC TGTGAGTGTG 420
TCCACGCTAA GCCTCGTGGC CATCGCACTG GAGCGATATA GCGCCATCTG CCGACCACTG 480
CAGGCACGAG TGTGGCAGAC GCGCTCCCAC GCGGCTCGCG TGATTGTAGC CACGTGGCTG 540
CTGTCCGGAC TACTCATGGT GCCCTACCCC GTGTACACTG TCGTGCAACC AGTGGGGCCT 600
CGTGTGCTGC AGTGCGTGCA TCGCTGGCCC AGTGCGCGGG TCCGCCAGAC CTGGTCCGTA 660
CTGCTGCTTC TGCTCTTGTT CTTCATCCCA GGTGTGGTTA TGGCCGTGGC CTACGGGCTT 720
ATCTCTCGCG AGCTCTACTT AGGGCTTCGC TTTGACGGCG ACAGTGACAG CGACAGCCAA 780
AGCAGGGTCC GAAACCAAGG CGGGCTGCCA GGGGCTGTTC ACCAGAACGG GCGTTGCCGG 840
CCTGAGACTG GCGCGGTTGG CAAAGACAGC GATGGCTGCT ACGTGCAACT TCCACGTTCC 900
CGGCCTGCCC TGGAGCTGAC GGCGCTGACG GCTCCTGGGC CGGGATCCGG CTCCGGGCGC 960
ACCCAGGCCA AGCTGCTGGC TAAGAAGCGC GTGAAACGAA TGTTGCTGGT GATCGTTGTG 1020
CTTTTTTTTC TGTGTTGGTT GCCAGTTTAT AGTGCCAACA CGTGGCGCGC CTTTGATGGC 1080
CCGGGTGCAC ACCGAGCACT CTCGGGTGCT CCTATCTCCT TCATTCACTT GCTGAGCTAC 1140
GCCTCGGCCT GTGTCAACCC CCTGGTCTAC TGCTTCATGC AGCGTCGCTT TCGCCAGGCC 1200
TGCCTGGAAA CTTGCGCTCG CTGCTGCCCC CGGCCTCCAC GAGCTCGCCC CAGGGCTCTT 1260
CCCGATGAGG ACCCTCCCAC TCCCTCCATT GCTTCGCTGT CCAGGCTTAG CTACACCACC 1320
ATCAGCACAC TGGGCCCTGG CTGA 1344

(113) INFORMATION FOR SEQ ID NO:112:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 447 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:112:

Met Glu Leu Leu Lys Leu Asn Arg Ser Val Gln Gly Thr Gly Pro Gly
1 5 10 15

Pro Gly Ala Ser Leu Cys Arg Pro Gly Ala Pro Leu Leu Asn Ser Ser
20 25 30

Ser Val Gly Asn Leu Ser Cys Glu Pro Pro Arg Ile Arg Gly Ala Gly
35 40 45

Thr Arg Glu Leu Glu Leu Ala Ile Arg Ile Thr Leu Tyr Ala Val Ile
50 55 60

Phe Leu Met Ser Val Gly Gly Asn Met Leu Ile Ile Val Val Leu Gly
65 70 75 80

Leu Ser Arg Arg Leu Arg Thr Val Thr Asn Ala Phe Leu Leu Ser Leu
85 90 95

Ala Val Ser Asp Leu Leu Leu Ala Val Ala Cys Met Pro Phe Thr Leu
100 105 110

Leu Pro Asn Leu Met Gly Thr Phe Ile Phe Gly Thr Val Ile Cys Lys
115 120 125

Ala Val Ser Tyr Leu Met Gly Val Ser Val Ser Val Ser Thr Leu Ser
130 135 140

Leu Val Ala Ile Ala Leu Glu Arg Tyr Ser Ala Ile Cys Arg Pro Leu
145 150 155 160

Gln Ala Arg Val Trp Gln Thr Arg Ser His Ala Ala Arg Val Ile Val
165 170 175

Ala Thr Trp Leu Leu Ser Gly Leu Leu Met Val Pro Tyr Pro Val Tyr
180 185 190

Thr Val Val Gln Pro Val Gly Pro Arg Val Leu Gln Cys Val His Arg
195 200 205

Trp Pro Ser Ala Arg Val Arg Gln Thr Trp Ser Val Leu Leu Leu Leu
210 215 220

Leu Leu Phe Phe Ile Pro Gly Val Val Met Ala Val Ala Tyr Gly Leu
225 230 235 240

Ile Ser Arg Glu Leu Tyr Leu Gly Leu Arg Phe Asp Gly Asp Ser Asp
245 250 255

Ser Asp Ser Gln Ser Arg Val Arg Asn Gln Gly Gly Leu Pro Gly Ala
260 265 270

Val His Gln Asn Gly Arg Cys Arg Pro Glu Thr Gly Ala Val Gly Lys
275 280 285

Asp Ser Asp Gly Cys Tyr Val Gln Leu Pro Arg Ser Arg Pro Ala Leu
290 295 300

Glu Leu Thr Ala Leu Thr Ala Pro Gly Pro Gly Ser Gly Ser Arg Pro
305 310 315 320

Thr Gln Ala Lys Leu Leu Ala Lys Lys Arg Val Lys Arg Met Leu Leu
325 330 335

Val Ile Val Val Leu Phe Phe Leu Cys Trp Leu Pro Val Tyr Ser Ala
340 345 350

Asn Thr Trp Arg Ala Phe Asp Gly Pro Gly Ala His Arg Ala Leu Ser
355 360 365

Val Ala Pro Ile Ser Phe Ile His Leu Leu Ser Tyr Ala Ser Ala Cys
370 375 380

Val Asn Pro Leu Val Tyr Cys Phe Met His Arg Arg Phe Arg Gln Ala
385 390 395 400

Cys Leu Glu Thr Cys Ala Arg Cys Cys Pro Arg Pro Pro Arg Ala Arg
405 410 415

Pro Arg Ala Leu Pro Asp Glu Asp Pro Pro Thr Pro Ser Ile Ala Ser
420 425 430

Leu Ser Arg Leu Ser Tyr Thr Thr Ile Ser Thr Leu Gly Pro Gly
435 440 445

(114) INFORMATION FOR SEQ ID NO:113:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:113:
CAGCAGCATG CGCTTCACGC GCTTCTTAGC CCAG 34

(115) INFORMATION FOR SEQ ID NO:114:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:114:
AGAAGCGCGT GAAGCGCATG CTGCTGGTGA TCGTT 35

(116) INFORMATION FOR SEQ ID NO:115:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:115:
ATGGAGAAAA GAATCAAAAG AATGTTCTAT ATA 33

(117) INFORMATION FOR SEQ ID NO:116:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:116:
TATATAGAAC ATTCTTTTGA TTCTTTTCTC CAT 33

(118) INFORMATION FOR SEQ ID NO:117:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO



PCT/US99/24065 . 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO:117:
CGCTCTCTGG CCTTGAAGCG CACGCTCAGC 30

(119) INFORMATION FOR SEQ ID NO:118:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:118:
GCTGAGCGTG CGCTTCAAGG CCAGAGAGCG 30

(120) INFORMATION FOR SEQ ID NO:119:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:119:
CCCAGGAAAA AGGTGAAAGT CAAAGTTTTC 30

(121) INFORMATION FOR SEQ ID NO:120:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:120:
GAAAACTTTG ACTTTCACCT TTTTCCTGGG 30

(122) INFORMATION FOR SEQ ID NO:121:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:121:
GGGGCGCGGG TGAAACGGCT GGTGAGC 27

(123) INFORMATION FOR SEQ ID NO:122:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:122:
GCTCACCAGC CGTTTCACCC GCGCCCC 27

(124) INFORMATION FOR SEQ ID NO:123:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123:
CCCCTTGAAA AGCCTAAGAA CTTGGTCATC 30
(125) INFORMATION FOR SEQ ID NO:124:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:124:
GATGACCAAG TTCTTAGGCT TTTCAAGGGG 30

(126) INFORMATION FOR SEQ ID NO:125:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 32 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125:
GATCTCTAGA ATGAACAGCA CATGTATTGA AG 32

(127) INFORMATION FOR SEQ ID NO:126:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:
CTAGGGTACC CGCTCAAGGA CCTCTAATTC CATAG 35

(128) INFORMATION FOR SEQ ID NO:127:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1296 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:
ATGCAGGCGC TTAACATTAC CCCGGAGCAG TTCTCTCGGC TGCTGCGGGA CCACAACCTG 60
ACGCGGGAGC AGTTCATCGC TCTGTACCGG CTGCGACCGC TCGTCTACAC CCCAGAGCTG 120
CCGGGACGCG CCAAGCTGGC CCTCGTGCTC ACCGGCGTGC TCATCTTCGC CCTGGCGCTC 180
TTTGGCAATG CTCTGGTGTT CTACGTGGTG ACCCGCAGCA AGGCCATGCG CACCGTCACC 240
AACATCTTTA TCTGCTCCTT GGCGCTCAGT GACCTGCTCA TCACCTTCTT CTGCATTCCC 300
GTCACCATGC TCCAGAACAT TTCCGACAAC TGGCTGGGGG GTGCTTTCAT TTGCAAGATG 360
GTGCCATTTG TCCAGTCTAC CGCTGTTGTG ACAGAAATGC TCACTATGAC CTGCATTGCT 420
GTGGAAAGGC ACCAGGGACT TGTGCATCCT TTTAAAATGA AGTGGCAATA CACCAACCGA 480
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720 
780 
840 
900 



. ™ m ~0 „CAC TC?,TCJTAS3 ATCACCCATO 

■ ■ : — - —OA GATCAAATAT GACTTCCTAT ATGAAAAGGA. ACACATCTGC 
7 ^ ^^TAGAAG "^''^'"^Ag" CCCTGTGCAC * CAGAASATCT ACACCACCT7 CATCCT^STC^ 

™ ~ — ~™ — ^ 

«^AAAA TAGCCA3GAA* GAAGAAACGA CCT„GAT7A TGATGGTOAC AGTGGT3GCT 

™ — — — ^ ATACA3TAAT 

•" 0G AAGA7GATTT TTGCTATCGT gcAAA^TT sso 

»«CATCTG TAATCCCATT GTCTATGCAT TTATGAATGA 

^ ™~ a^ a32:^ ,:■ - 

, , ^scArGGAA 'attcaggaat' tacaatgatg' cggaagaaag caaag^c ™ i08 ° 

- "c,_g TO a~ :i: 

,., — — — CTC AAACGACATC 1200 
;. : GTGGCTGAGA ATTCTCCTT7 AtlACAGTGGG CATTAA \'v ' " 

(129) INFORMATION FOR SEQ ID NO:128:

(i) SEQUENCE CHARACTERISTICS:

ifi'= rai " l acid, 
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:128:

Met Gln Ala Leu Asn Ile Thr Pro Glu Gln Phe Ser Arg Leu Leu Arg
1 5 10 15

Asp His Asn Leu Thr Arg Glu Gln Phe Ile Ala Leu Tyr Arg Leu Arg
20 25 30

Pro Leu Val Tyr Thr Pro Glu Leu Pro Gly Arg Ala Lys Leu Ala Leu
35 40 45

Val Leu Thr Gly Val Leu Ile Phe Ala Leu Ala Leu Phe Gly Asn Ala
50 55 60

Leu Val Phe Tyr Val Val Thr Arg Ser Lys Ala Met Arg Thr Val Thr
65 70 75 80
Asn Ile Phe Ile Cys Ser Leu Ala Leu Ser Asp Leu Leu Ile Thr Phe
85 90 95

Phe Cys Ile Pro Val Thr Met Leu Gln Asn Ile Ser Asp Asn Trp Leu
100 105 110

Gly Gly Ala Phe Ile Cys Lys Met Val Pro Phe Val Gln Ser Thr Ala
115 120 125

Val Val Thr Glu Met Leu Thr Met Thr Cys Ile Ala Val Glu Arg His
130 135 140

Gln Gly Leu Val His Pro Phe Lys Met Lys Trp Gln Tyr Thr Asn Arg
145 150 155 160

Arg Ala Phe Thr Met Leu Gly Val Val Trp Leu Val Ala Val Ile Val
165 170 175

Gly Ser Pro Met Trp His Val Gln Gln Leu Glu Ile Lys Tyr Asp Phe
180 185 190

Leu Tyr Glu Lys Glu His Ile Cys Cys Leu Glu Glu Trp Thr Ser Pro
195 200 205

Val His Gln Lys Ile Tyr Thr Thr Phe Ile Leu Val Ile Leu Phe Leu
210 215 220

Leu Pro Leu Met Val Met Leu Ile Leu Tyr Ser Lys Ile Gly Tyr Glu
225 230 235 240

Leu Trp Ile Lys Lys Arg Val Gly Asp Gly Ser Val Leu Arg Thr Ile
245 250 255

His Gly Lys Glu Met Ser Lys Ile Ala Arg Lys Lys Lys Arg Ala Lys
260 265 270

Ile Met Met Val Thr Val Val Ala Leu Phe Ala Val Cys Trp Ala Pro
275 280 285

Phe His Val Val His Met Met Ile Glu Tyr Ser Asn Phe Glu Lys Glu
290 295 300

Tyr Asp Asp Val Thr Ile Lys Met Ile Phe Ala Ile Val Gln Ile Ile
305 310 315 320

Gly Phe Ser Asn Ser Ile Cys Asn Pro Ile Val Tyr Ala Phe Met Asn
325 330 335

Glu Asn Phe Lys Lys Asn Val Leu Ser Ala Val Cys Tyr Cys Ile Val
340 345 350

Asn Lys Thr Phe Ser Pro Ala Gln Arg His Gly Asn Ser Gly Ile Thr
355 360 365

Met Met Arg Lys Lys Ala Lys Phe Ser Leu Arg Glu Asn Pro Val Glu
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5 



. • , ■ 395 ' *: : ■■■ .4oo 

: ^ C "> «» Lys L y s l«, Lys : sr? His . _ 

- ... ' - ' . - ' 410 ■ 415 ■ 

. "• . ■' • 425 ...v - 430 

(130) INFORMATION FOR SEQ ID NO:129:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2040 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129:

ATGGGCAGCC CC7GGAACGG CAGCGACGGC CCCOACCCCG CGCCCCAGCC GCCGTCGCCC 
... V ? -CTGCCGC CTTCCCACGA GCGCCGCTOC TCCCCCTTTC CCCTGGGGGC GCTCGTCCCG 
20 «f CC0CTO TCm ° CCTGTTCGTC GTCGGGGTGA GCGGCAACGT GGTGACCGTG ' 
: . ■ ■ ATGCTGATCG GGCGCTACCG GGACATGCGG ACCACCACCA ACT.GTACCT GGGCAGCATQ ^ 
^ GCCGTOTCCG ACCTACTCAT CCTCCTCGGC CTGCCCTTCG ACCTGTACCG CC^GCGC 
XCGCGOCCCT GGCTGTTCCG GCCGCTGCTC TGCCGCC7GT CCCTCTACGT GGGCCAGGGC 
30 4 T ;CTaC0 raC ° CTOT OCGC-CAGCG TCGAGCGCTA CCTGCCCATC 

^ ' '. ^^^^"^ ' ^^-OCGCCCG CGTCTTGGTC. ACCCGGCGCC GCGTCCGCGC 'GCTCATCGCT ■ ' ' 
35 ST"* CCGTG3CGCT GCTCTCTGCC GGTCCCTTCT TGTTCCTGGT ^GGGCGTCGAG ' .. 

CAGGACCCCG GCATCC.-GT AGTCCCGG=C CTCAArCGCA CCGCGC, 3AI CCiCCX'CTCG 
40 CCTCTCGCCT CGTCGCCGCC TCTC^GCTC TCGCGGGCGC «« CCC,CCGTCG 
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660 

GGGCCCGAGA CCGCGGAGGC CGCGGCGCTG TTCAGCCGCG AATGCCGGCC GAGCCCCGCG 720
GAGCTGGGCG CGCTGCGTGT CATGCTGTGG GTCACCACCG CCTACTTCTT CCTGCCCTTT 780
CTGTGCCTCA GCATCCTCTA CGGGCTCATC GGGCGGGAGC TGTGGAGCAG CCGGCGGCCG 840
CTGCGAGGCC CGGCCGCCTC GGGGCpGGAG AGAGGCCACC GGCAGACCAA ACGCGTCCTG 900
CGTAAGTGGA GCCGCCGTGG TTCCAAAGAC GCCTGCCTGC AGTCCGCCCC GCCGGGGACC 960
GCGCAAACGC TGGGTCCCCT TCCCCTGCTC GCCCAGCTCT GGGCGCCGGT TCCAGCTCCC 1020
TTTCCTATTT CGATTCCAGC CTCCACCCGC CGGTACTTCC CATCCCCCGA GAAAACCATG 1080
TCCTGTCCCC CAGGAGCTCT GGGGGACCCC AGGGCGCTTT GAGGGTGGGA TCCCCGGATC 1140
CGATTCAGTA ACCAGCAGTG CTTTTCCAGA GCCTCTGAGA CCAGAAAGGA GAGTTGGTAA 1200
TTGTTAATCC AACCACCTGT TAGATGCCAC AAATGAGGAG TCCTCACAGT GCTCTTGAGA 1260
AGACGAGGGA GATTTCATTA AGCTAAAATT TTTTATTTAA TGTTAAGTGA TGCTGAAGGC 1320
TAAAGTAAAC CTTGCTCGTA TCAAAAAGTA AAGATTGTGC AGACCTGTTG TAGAATTCTT 1380
TTCAACAGAG AACAGAAAAC TTGTCTCCGA AGTGGGTTTG TGGAAGGAAG CCTGCCAAGG 1440
CGGCTTTGTTC AGAGAAATTG CTCCTTCTGG TTTATGTCCA GCCTTGATAA CACATATGGG 1500
AGCCTACTAT GCAGTTTTAA AGCAAGTATC CATGCAGCCT GCAGCCTGGT CATTTTTTCT 1560
GGGGTGAGGA TCTGCCTAGG TAGAAGTTTT CTCTAATTTA TTTTGCTGTT ACTTGTTATT 1620
GCAGATGGTT CCTTGTCGGG GTGGGGGGTT TATTTGCTTC CCAATGCTTT TGTTAATCCC 1680
GGTGCTGTGT CTTATGTTGC AGTGGTGGTG GTTCTGGCAT TTATAATTTG CTGGTTGCCC 1740
TTCCACGTTG GCAGAATCAT TTACATAAAC ACGGAAGATT CGCGGATGAT GTACTTCTCT 1800
CAGTACTTTA ACATCGTCGC TCTGCAACTT TTCTATCTGA GCGCATCTAT CAACCCAATC 1860
CTCTACAACC TCATTTCAAA GAAGTACAGA GCGGCGGCCT TTAAACTGCT GCTCGCAAGG 1920
. ^JCCAGGC CGAGAGGCTT CCACAG AAGC AGGGACACTG CGGGGGAAGT TGCAGGGGAC , ' 
- 15 :^ GAGGAG ACACGGTGGG CTACACCGAG ACAAGCGCTA ACGTGAAGAC GATGGGATAA [ ' V 

(131) INFORMATION FOR SEQ ID NO:130:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 412 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:130:

25 v ; : Pr °^ rP Asn ^ fly Pro Glu Gly Ala Arg Glu- ^ 



30 



35 



40 



10 



15 

Pro Pro Trp Pro Ala Leu Pro Pro Cys Asp Glu Arg Arg Cys Ser Pro 



25 



30 

Phe Pro Leu Gly Ala Leu Val Pro Val^ Tnr Ala Val Cys Leu Cys Leu 

... 35 • ■ ■ < 40 .. .• ';45 >. - ■; ;■ 

Phe Valval Gly Val Ser Gly Asn Val Val Thr Val Met Leu lie Gly 

: 55 .■• . 60 ' ■ .. .- 

Arg Tyr Arg A,p. M et Arg Thr Thr Thr Asn Leu Tyr Leu Gly Ser Met 

■ ■' : - :: 70 v ." ' 75 so' 

Ala Val Ser Asp Leu Leu lie Leu Leu Gly Leu Pro . Phe Asp' Leu Tyr 
Arg Leu Trp' Arg Ser Arg Pro Trp Val; Phe Gly . Pro Leu . Leu Cys Arg 



95 

2r Arg Pro Trp Val. Phe Glv Pro t.«, 

.100. ' 

Leu Ser Leu Tyr Val Gly Glu Gly: Cys Thr Tyr Ala Thr 



115 ' 7 a Thr Leu Leu His 

"* t S «« «« val «» A.- g ^ Lea AU „. cys ^ Pro U. 



135 



140 
Arg Ala Arg Val Leu Val Thr Arg Arg Arg Val Arg Ala Leu Ile Ala
145 150 155 160

Val Leu Trp Ala Val Ala Leu Leu Ser Ala Gly Pro Phe Leu Phe Leu
165 170 175

Val Gly Val Glu Gln Asp Pro Gly Ile Ser Val Val Pro Gly Leu Asn
180 185 190

Gly Thr Ala Arg Ile Ala Ser Ser Pro Leu Ala Ser Ser Pro Pro Leu
195 200 205

Trp Leu Ser Arg Ala Pro Pro Pro Ser Pro Pro Ser Gly Pro Glu Thr
210 215 220

Ala Glu Ala Ala Ala Leu Phe Ser Arg Glu Cys Arg Pro Ser Pro Ala
225 230 235 240

Gln Leu Gly Ala Leu Arg Val Met Leu Trp Val Thr Thr Ala Tyr Phe
245 250 255

Phe Leu Pro Phe Leu Cys Leu Ser Ile Leu Tyr Gly Leu Ile Gly Arg
260 265 270

Glu Leu Trp Ser Ser Arg Arg Pro Leu Arg Gly Pro Ala Ala Ser Gly
275 280 285

Arg Glu Arg Gly His Arg Gln Thr Lys Arg Val Leu Leu Val Val Val
290 295 300

Leu Ala Phe Ile Ile Cys Trp Leu Pro Phe His Val Gly Arg Ile Ile
305 310 315 320

Tyr Ile Asn Thr Glu Asp Ser Arg Met Met Tyr Phe Ser Gln Tyr Phe
325 330 335

Asn Ile Val Ala Leu Gln Leu Phe Tyr Leu Ser Ala Ser Ile Asn Pro
340 345 350

Ile Leu Tyr Asn Leu Ile Ser Lys Lys Tyr Arg Ala Ala Ala Phe Lys
355 360 365

Leu Leu Leu Ala Arg Lys Ser Arg Pro Arg Gly Phe His Arg Ser Arg
370 375 380

Asp Thr Ala Gly Glu Val Ala Gly Asp Thr Gly Gly Asp Thr Val Gly
385 390 395 400

Tyr Thr Glu Thr Ser Ala Asn Val Lys Thr Met Gly
405 410

(132) INFORMATION FOR SEQ ID NO:131:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1344 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:131:

ATGGAGCTGC TAAAGCTGAA CCGGAGCGTG CAGGGAACCG GACCCGGGCC GGGGGCTTCC 60
CTGTGCCGCC CGGGGGCGCC TCTCCTCAAC AGCAGCAGTG TGGGCAACCT CAGCTGCGAG 120
CCCCCTCGCA TTCGCGGAGC CGGGACACGA GAATTGGAGC TGGCCATTAG AATCACTCTT 180
TACGCAGTGA TCTTCCTGAT GAGCGTTGGA GGAAATATGC TCATCATCGT GGTCCTGGGA 240
CTGAGCCGCC GCCTGAGGAC TGTCACCAAT GCCTTCCTCC TCTCACTGGC AGTCAGCGAC 300
CTCCTGCTGG CTGTGGCTTG CATGCCCTTC ACCCTCCTGC CCAATCTCAT GGGCACATTC 360
ATCTTTGGCA CCGTCATCTG CAAGGCGGTT TCCTACCTCA TGGGGGTGTC TGTGAGTGTG 420
TCCACGCTAA GCCTCGTGGC CATCGCACTG GAGCGATATA GCGCCATCTG CCGACCACTG 480
CAGGCACGAG TGTGGCAGAC GCGCTCCCAC GCGGCTCGCG TGATTGTAGC CACGTGGCTG 540
CTGTCCGGAC TACTCATGGT GCCCTACCCC GTGTACACTG TCGTGCAACC AGTGGGGCCT 600
CGTGTGCTGC AGTGCGTGCA TCGCTGGCCC AGTGCGCGGG TCCGCCAGAC CTGGTCCGTA 660
CTGCTGCTTC TGCTCTTGTT CTTCATCCCA GGTGTGGTTA TGGCCGTGGC CTACGGGCTT 720
ATCTCTCGCG AGCTCTACTT AGGGCTTCGC TTTGACGGCG ACAGTGACAG CGACAGCCAA 780
AGCAGGGTCC GAAACCAAGG CGGGCTGCCA GGGGCTGTTC ACCAGAACGG GCGTTGCCGG 840
CCTGAGACTG GCGCGGTTGG CAAAGACAGC GATGGCTGCT ACGTGCAACT TCCACGTTCC 900
CGGCCTGCCC TGGAGCTGAC GGCGCTGACG GCTCCTGGGC CGGGATCCGG CTCCCGGCCC 960

ACCCAGGCCA AGCTGCTGGC TAAGAAGCGC GTGAAACGAA TGTTGCTGGT GATCGTTGTG 1020
CTTTTTTTTC TGTGTTGGTT GCCAGTTTAT AGTGCCAACA CGTGGCGCGC CTTTGATGGC 1080
CCGGGTGCAC ACCGAGCACT CTCGGGTGGT CCTATCTCCT TCATTCACTT GCTGAGCTAC 1140
GCCTCGGCCT GTGTCAACCC CCTGGTCTAC TGCTTCATGC ACCGTCGCTT TCGCCAGGCC 1200
TGCCTGGAAA CTTGCGCTCG CTGCTGCCCC CGGCCTCCAC GAGCTCGCCC CAGGGCTCTT 1260
CCCGATGAGG ACCCTCCCAC TCCCTCCATT GCTTCGCTGT CCAGGCTTAG CTACACCACC 1320
ATCAGCACAC TGGGCCCTGG CTGA 1344

(133) INFORMATION FOR SEQ ID NO:132:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 447 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:132:

Met Glu Leu Leu Lys Leu Asn Arg Ser Val Gln Gly Thr Gly Pro Gly
1 5 10 15

Pro Gly Ala Ser Leu Cys Arg Pro Gly Ala Pro Leu Leu Asn Ser Ser
20 25 30

Ser Val Gly Asn Leu Ser Cys Glu Pro Pro Arg Ile Arg Gly Ala Gly
35 40 45

Thr Arg Glu Leu Glu Leu Ala Ile Arg Ile Thr Leu Tyr Ala Val Ile
50 55 60

Phe Leu Met Ser Val Gly Gly Asn Met Leu Ile Ile Val Val Leu Gly
65 70 75 80

Leu Ser Arg Arg Leu Arg Thr Val Thr Asn Ala Phe Leu Leu Ser Leu
85 90 95

Ala Val Ser Asp Leu Leu Leu Ala Val Ala Cys Met Pro Phe Thr Leu



wo 00/22131 ■ 
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,: : ■ 155 . . : ; 160 

Gin Ala Arg Val Trp Gin Thr Arg Ser His Ala AT a a ^ 
' 165 . iin A la Arg Val He Val 

: - „ , eu 5ar 0Iy Lc3 ^ vaj p , o ^ w ^ ^ : 

. ,; ■ ; - ~ HI - - - 0ly lT „ val ^ Gln cys val ^ 

"' 200 " ■ ■-. . 205 ■ . ' 

, 5 : Irp £ r.-f > val £ »■:«»» *«. Leu L6U Leu j 

3 : 220 

^ ^ ao Mlc Uo Gly vai va; Met Ala ^ ^^^ Leu • 

' " '. . ;23S . .;. . .240 

lie Ser Arg Glu Leu Tvr t,p„ n„ T 

. 245 7 L6U Ar9 ' Phe As P G1 y Asp Ser Asp , 

. '[■..:■■•'■ ' ' v . ' ■ .' :■. 250 ■. .-.< ■ • ■ , ■ 255- . 

20 1 Ser Asp Ser Gin Ser a™ v=i * 

260 9 ^ J!" Gln <»y Pro Gly Ala 

■ : 26? :■. 270 . 

-1 H is ^ ie „ Gly ^ ^ p r „. Glu Thr GIy ua : val Gly t s 

2J *n> £ ,, p 0:y ^ „ x gik ^ Pro 6r , ser srg pro au 

, - - - - - s ,^ 01y Pro ,, y ser Gly - :i , g . pro 

Thr Gin Ala Lys Leu Leu : Ala 

- ; ■ ; va, Ile v al v.: ..„ u Tlp Loj Iyr ^ xu ■ 

• - ; - ^ -■- 350 

- - = ly iIi5 , rs AU tol ser ; 

' . . ■ • ' 365 , . '. 

Val Ala Pro lie Ser Phe He m« T r 
5 . 370 . . • - tit HlS LeU Leu Tyr Ala Ser Ala 



310 - ■ 315 . 

, • 320 

325 ■ ' LyS LyS f 3 Val L ^ Arg Met Leu Leu 

■ 33 °' 335 



37 c — 4 J ,J - rtia ser Ala Cys 

■ • ' 380 

jal - ? .-o vax S . ^ pcs f!et ^ ^ phe ^ iu _ _ _ _ 

395 400 
Cys Leu Glu Thr Cys Ala Arg Cys Cys Pro Arg Pro Pro Arg Ala Arg
405 410 415

Pro Arg Ala Leu Pro Asp Glu Asp Pro Pro Thr Pro Ser Ile Ala Ser
420 425 430

Leu Ser Arg Leu Ser Tyr Thr Thr Ile Ser Thr Leu Gly Pro Gly
435 440 445

(134) INFORMATION FOR SEQ ID NO:133:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1014 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:133:
ATGAACAGCA CATGTATTGA AGAACAGCAT GACCTGGATC ACTATTTGTT TCCCATTGTT 60
TACATCTTTG TGATTATAGT CAGCATTCCA GCCAATATTG GATCTCTGTG TGTGTCTTTC 120
CTGCAAGCAA AGAAGGAAAG TGAACTAGGA ATTTACCTCT TCAGTTTGTC ACTATCAGAT 180
TTACTCTATG CATTAACTCT CCCTTTATGG ATTGATTATA CTTGGAATAA AGACAACTGG 240
ACTTTCTCTC CTGCCTTGTG CAAAGGGAGT GCTTTTCTCA TGTACATGAA TTTTTACAGC 300
AGCACAGCAT TCCTCACCTG CATTGCCGTT GATCGGTATT TGGCTGTTGT CTACCCTTTG 360
AAGTTTTTTT TCCTAAGGAC AAGAAGATTT GCACTCATGG TCAGCCTGTC CATCTGGATA 420
TTGGAAACCA TCTTCAATGC TGTCATGTTG TGGGAAGATG AAACAGTTGT TGAATATTGC 480
GATGCCGAAA AGTCTAATTT TACTTTATGC TATGACAAAT ACCCTTTAGA GAAATGGCAA 540
ATCAACCTCA ACTTGTTCAG GACGTGTACA GGCTATGCAA TACCTTTGGT CACCATCCTG 600
ATCTGTAACC GGAAAGTCTA CCAAGCTGTG CGGCACAATA AAGCCACGGA AAACAAGGAA 660
AAGAAGAGAA TCAAAAAACT ACTTGTCAGC ATCACAGTTA CTTTTGTCTT ATGCTTTACT 720
CCCTTTCATG TGATGTTGCT GATTCGCTGC ATTTTAGAGC ATGCTGTGAA CTTCGAAGAC 780
CACAGCAATT CTGGGAAGCG AACTTACACA ATGTATAGAA TCACGGTTGC ATTAACAAGT 840
TTAAATTGTG TTGCTGATCC AATTCTGTAC TGTTTTGTTA CCGAAACAGG AAGATATGAT 900
ATGTGGAATA TATTAAAATT CTGCACTGGG AGGTGTAATA CATCACAAAG ACAAAGAAAA 960
CGCATACTTT CTGTGTCTAC AAAAGATACT ATGGAATTAG AGGTCCTTGA GTAG 1014
(135) INFORMATION FOR SEQ ID NO:134:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 337 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:134:

Met Asn Ser Thr Cys Ile Glu Glu Gln His Asp Leu Asp His Tyr Leu
1 5 10 15

Phe Pro Ile Val Tyr Ile Phe Val Ile Ile Val Ser Ile Pro Ala Asn
20 25 30

Ile Gly Ser Leu Cys Val Ser Phe Leu Gln Ala Lys Lys Glu Ser Glu
35 40 45

Leu Gly Ile Tyr Leu Phe Ser Leu Ser Leu Ser Asp Leu Leu Tyr Ala
50 55 60

Leu Thr Leu Pro Leu Trp Ile Asp Tyr Thr Trp Asn Lys Asp Asn Trp
65 70 75 80

Thr Phe Ser Pro Ala Leu Cys Lys Gly Ser Ala Phe Leu Met Tyr Met
85 90 95

Asn Phe Tyr Ser Ser Thr Ala Phe Leu Thr Cys Ile Ala Val Asp Arg
100 105 110

Tyr Leu Ala Val Val Tyr Pro Leu Lys Phe Phe Phe Leu Arg Thr Arg
115 120 125

Arg Phe Ala Leu Met Val Ser Leu Ser Ile Trp Ile Leu Glu Thr Ile
130 135 140

Phe Asn Ala Val Met Leu Trp Glu Asp Glu Thr Val Val Glu Tyr Cys
145 150 155 160

Asp Ala Glu Lys Ser Asn Phe Thr Leu Cys Tyr Asp Lys Tyr Pro Leu
165 170 175

Glu Lys Trp Gln Ile Asn Leu Asn Leu Phe Arg Thr Cys Thr Gly Tyr
180 185 190

Ala Ile Pro Leu Val Thr Ile Leu Ile Cys Asn Arg Lys Val Tyr Gln
195 200 205

Ala Val Arg His Asn Lys Ala Thr Glu Asn Lys Glu Lys Lys Arg Ile
210 215 220
Lys Lys Leu Leu Val Ser Ile Thr Val Thr Phe Val Leu Cys Phe Thr
225 230 235 240

Pro Phe His Val Met Leu Leu Ile Arg Cys Ile Leu Glu His Ala Val
245 250 255

Asn Phe Glu Asp His Ser Asn Ser Gly Lys Arg Thr Tyr Thr Met Tyr
260 265 270

Arg Ile Thr Val Ala Leu Thr Ser Leu Asn Cys Val Ala Asp Pro Ile
275 280 285

Leu Tyr Cys Phe Val Thr Glu Thr Gly Arg Tyr Asp Met Trp Asn Ile
290 295 300

Leu Lys Phe Cys Thr Gly Arg Cys Asn Thr Ser Gln Arg Gln Arg Lys
305 310 315 320

Arg Ile Leu Ser Val Ser Thr Lys Asp Thr Met Glu Leu Glu Val Leu
325 330 335

Glu

(136) INFORMATION FOR SEQ ID NO:135:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 999 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:135:
ATGGTGAACT CCACCCACCG TGGGATGCAC ACTTCTCTGC ACCTCTGGAA CCGCAGCAGT 60
TACAGACTGC ACAGCAATGC CAGTGAGTCC CTTGGAAAAG GCTACTCTGA TGGAGGGTGC 120
TACGAGCAAC TTTTTGTCTC TCCTGAGGTG TTTGTGACTC TGGGTGTCAT CAGCTTGTTG 180
GAGAATATCT TAGTGATTGT GGCAATAGCC AAGAACAAGA ATCTGCATTC ACCCATGTAC 240
TTTTTCATCT GCAGCTTGGC TGTGGCTGAT ATGCTGGTGA GCGTTTCAAA TGGATCAGAA 300
ACCATTATCA TCACCCTATT AAACAGTACA GATACGGATG CACAGAGTTT CACAGTGAAT 360
ATTGATAATG TCATTGACTC GGTGATCTGT AGCTCCTTGC TTGCATCCAT TTGCAGCCTG 420
CTTTCAATTG CAGTGGACAG GTAGTTTAGT ATCTTTCTATG CTCTCCAGTA CCATAACATT 480
ATGACAGTTA AGCGGGTTGG GATCAGCATA AGTTGTATCT GGGCAGCTTG CACGGTTTCA 540
GGCATTTTGT TCATCATTTA CTCAGATAGT AGTGCTGTCA TCATCTGCCT CATCACCATG 600
TTCTTCACCA TGCTGGCTCT CATGGCTTCT CTCTATGTCC ACATGTTCCT GATGGCCAGG 660
CTTCACATTA AGAGGATTGC TGTCCTCCCC GGCACTGGTG CCATCCGCCA AGGTGCCAAT 720
.'/' 780^ GGGAA AAATtA6CTT GAC ^CTG ATTGGCGTCT TTGTTGTCTG CTGGGCCCCA 

15 ™; TTCCTCC "acttaatatt ctacatctct tgtcctcaga atccatatto 

ATGTCTCACT TTAACTTGTA TCTCATACTG ATCATGTGTA ATTCAATCAT CGATCCTCTG 900
ATTTATGCAC TCCGGAGTCA AGAACTGAGG AAAACCTTCA AAGAGATCAT CTGTTGCTAT 960
CCCCTGGGAG GCCTTTGTGA CTTGTCTAGC AGATATTAA 999



(137) INFORMATION FOR SEQ ID NO:136:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 332 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:136:

Met Val Asn Ser Thr His Arg Gly Met His Thr Ser Leu His Leu Trp
1 5 10 15

Asn Arg Ser Ser Tyr Arg Leu His Ser Asn Ala Ser Glu Ser Leu Gly
20 25 30

Lys Gly Tyr Ser Asp Gly Gly Cys Tyr Glu Gln Leu Phe Val Ser Pro
35 40 45

Glu Val Phe Val Thr Leu Gly Val Ile Ser Leu Leu Glu Asn Ile Leu
50 55 60

Val Ile Val Ala Ile Ala Lys Asn Lys Asn Leu His Ser Pro Met Tyr
65 70 75 80

Phe Phe Ile Cys Ser Leu Ala Val Ala Asp Met Leu Val Ser Val Ser
85 90 95

Asn Gly Ser Glu Thr Ile Ile Ile Thr Leu Leu Asn Ser Thr Asp Thr
100 105 110

Asp Ala Gln Ser Phe Thr Val Asn Ile Asp Asn Val Ile Asp Ser Val
115 120 125

Ile Cys Ser Ser Leu Leu Ala Ser Ile Cys Ser Leu Leu Ser Ile Ala
130 135 140

Val Asp Arg Tyr Phe Thr Ile Phe Tyr Ala Leu Gln Tyr His Asn Ile
145 150 155 160

Met Thr Val Lys Arg Val Gly Ile Ser Ile Ser Cys Ile Trp Ala Ala
165 170 175

Cys Thr Val Ser Gly Ile Leu Phe Ile Ile Tyr Ser Asp Ser Ser Ala
180 185 190

Val Ile Ile Cys Leu Ile Thr Met Phe Phe Thr Met Leu Ala Leu Met
195 200 205

Ala Ser Leu Tyr Val His Met Phe Leu Met Ala Arg Leu His Ile Lys
210 215 220

Arg Ile Ala Val Leu Pro Gly Thr Gly Ala Ile Arg Gln Gly Ala Asn
225 230 235 240

Met Lys Gly Lys Ile Thr Leu Thr Ile Leu Ile Gly Val Phe Val Val
245 250 255

Cys Trp Ala Pro Phe Phe Leu His Leu Ile Phe Tyr Ile Ser Cys Pro
260 265 270

Gln Asn Pro Tyr Cys Val Cys Phe Met Ser His Phe Asn Leu Tyr Leu
275 280 285

Ile Leu Ile Met Cys Asn Ser Ile Ile Asp Pro Leu Ile Tyr Ala Leu
290 295 300

Arg Ser Gln Glu Leu Arg Lys Thr Phe Lys Glu Ile Ile Cys Cys Tyr
305 310 315 320

Pro Leu Gly Gly Leu Cys Asp Leu Ser Ser Arg Tyr
325 330



(138) INFORMATION FOR SEQ ID NO: 137:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:

GCCAATATGA AGGGAAAAAT TACCTTGACC ATC 33

(139) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:
CTCCTTCGGT CCTCCTATCG TTGTCAGAAG T 31

(140) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

CCCTAGCGGT TCCCACCCCC TATGGCTGTA TTGGCTGTAA GCTACCCCAG 60 

CCAGAATACC CACCGGCTCT AATCATCTTT ATGTTCTGCG CGATGGTTAT CACCATCGTT 120

GTAGACCTAA TCGGCAACTC CATGGTCATT TTGGCTGTGA CGAAGAACAA GAAGCTCCGG 180

AATTCTGGCA ACATCTTCGT GGTCAGTCTC TCTGTGGCCG ATATGCTGGT GGCCATCTAC 240

CCATACCCTT TGATGCTGCA TGCCATGTCC ATTGGGGGCT GGGATCTGAG CCAGTTACAG 300

TGCCAGATGG TCGGGTTCAT CACAGGGCTG AGTGTGGTCG GCTCCATCTT CAACATCGTG 360
GCAATCGCTA TCAACCGTTA CTGCTACATC TGCCACAGCC TCCAGTACGA ACGGATCTTC 420

AGTGTGCGCA ATACCTGCAT CTACCTGGTC ATCACCTGGA TCATGACCGT CCTGGCTGTC 480

CTGCCCAACA TGTACATTGG CACCATCGAG TACGATCCTC GCACCTACAC CTGCATCTTC 540

AACTATCTGA ACAACCCTGT CTTCACTGTT ACCATCGTCT GCATCCACTT CGTCCTCCCT 600

CTCCTCATCG TGGGTTTCTG CTACGTGAGG ATCTGGACCA AAGTGCTGGC GGCCCGTGAC 660

CCTGCAGGGC AGAATCCTGA CAACCAACTT GCTGAGGTTC GCAATTTTGT AACCATGTTT 720

GTGATCTTCC TCCTCTTTGC AGTGTGCTGG [illegible] ACGTGCTCAC TGTCTTGGTG 780

GCTGTCAGTC CGAAGGAGAT [illegible] ATCCCCAACT GGCTTTATCT TGCAGCCTAC 840

TTGATAGCCT ACTTCAACAG CTGCCTCAAC [illegible] ACGGGCTCCT CAATGAGAAT 900

TTCCGAAGAG AATACTGGAC CATCTTCCAT GCTATGCGGC ACCCTATCAT ATTCTTCCCT 960

GGCCTCATCA GTGATATTCG TGAGATGCAG GAGGCCCGTA [illegible] [illegible] 1020

CATGCTCGCG ACCAAGCTCG TGAACAAGAC CGTGCCCATG CCTGTCCTGC TGTGGAGGAA 1080

ACCCCGATGA ATGTCCGGAA TGTTCCATTA CCTGGTGATG CTGCAGCTGG CCACCCCGAC 1140

CGTGCCTCTG GCCACCCTAA GCCCCATTCC AGATCCTCCT CTGCCTATCG CAAATCTGCC 1200

TCTACCCACC ACAAGTCTGT CTTTAGCCAC TCCAAGGCTG CCTCTGGTCA CCTGAAGCCT 1260

GTCTCTGGCC ACTCCAAGCC TGCCTCTGGT CACCCCAAGT CTGCCACTGT CTACCCTAAG 1320

CCTGCCTCTG TCCATTTCAA GGGTGACTCT GTCCATTTCA AGGGTGACTC TGTCCATTTC 1380

AAGCCTGACT CTGTTCATTT CAAGCCTGCT TCCAGCAACC CCAAGCCCAT CACTGGCCAC 1440

GATGTCTCTG CTGGCAGCCA CTCCAAGTCT GCCTTCAGTG CTGCCACCAG CCACCCTAAA 1500

CCCATCAAGC CAGCTACCAG CCATGCTGAG CCCACCACTG CTGACTATCC CAAGCCTGCC 1560

ACTACCAGCC ACCCTAAGCC CGCTGCTGCT GAGAAGCCTG AGCTCTCTGC GTCCCATTGC 1620

CCCGAGATCC CTGCCATTGC CCACCCTGTG TCTGACGACA GTGACCTCCC TGAGTCGGCC 1680

TCTAGCCCTG CCGCTGGGCC CACCAAGCCT GCTGCCAGCC AGCTGGAGTC TGACACCATC 1740

GCTGACCTTC CTGACCCTAC TGTAGTCAGT ACCAGTACGA ATGATTACCA TGATGTCGTG 1800

GTTGTTGATG TTGAAGATGA TCCTGATGAA ATGGCTGTGT GA 1842



(141) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 613 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Met Gly Pro Thr Leu Ala Val Pro Thr Pro Tyr Gly Cys Ile Gly Cys
1 5 10 15

Lys Leu Pro Gln Pro Glu Tyr Pro Pro Ala Leu Ile Ile Phe Met Phe
20 25 30

Cys Ala Met Val Ile Thr Ile Val Val Asp Leu Ile Gly Asn Ser Met
35 40 45

Val Ile Leu Ala Val Thr Lys Asn Lys Lys Leu Arg Asn Ser Gly Asn
50 55 60

Ile Phe Val Val Ser Leu Ser Val Ala Asp Met Leu Val Ala Ile Tyr
65 70 75 80

Pro Tyr Pro Leu Met Leu His Ala Met Ser Ile Gly Gly Trp Asp Leu
85 90 95

Ser Gln Leu Gln Cys Gln Met Val Gly Phe Ile Thr Gly Leu Ser Val
100 105 110

Val Gly Ser Ile Phe Asn Ile Val Ala Ile Ala Ile Asn Arg Tyr Cys
115 120 125

Tyr Ile Cys His Ser Leu Gln Tyr Glu Arg Ile Phe Ser Val Arg Asn
130 135 140

Thr Cys Ile Tyr Leu Val Ile Thr Trp Ile Met Thr Val Leu Ala Val
145 150 155 160

Leu Pro Asn Met Tyr Ile Gly Thr Ile Glu Tyr Asp Pro Arg Thr Tyr
165 170 175

Thr Cys Ile Phe Asn Tyr Leu Asn Asn Pro Val Phe Thr Val Thr Ile
180 185 190

Val Cys Ile His Phe Val Leu Pro Leu Leu Ile Val Gly Phe Cys Tyr
195 200 205

Val Arg Ile Trp Thr Lys Val Leu Ala Ala Arg Asp Pro Ala Gly Gln
210 215 220

Asn Pro Asp Asn Gln Leu Ala Glu Val Arg Asn Phe Leu Thr Met Phe
225 230 235 240

Val Ile Phe Leu Leu Phe Ala Val Cys Trp Cys Pro Ile Asn Val Leu
245 250 255




Thr Val Leu Val Ala Val Ser Pro Lys Glu Met Ala Gly Lys Ile Pro
260 265 270

Asn Trp Leu Tyr Leu Ala Ala Tyr Phe Ile Ala Tyr Phe Asn Ser Cys
275 280 285

Leu Asn Ala Val Ile Tyr Gly Leu Leu Asn Glu Asn Phe Arg Arg Glu
290 295 300

Tyr Trp Thr Ile Phe His Ala Met Arg His Pro Ile Ile Phe Phe Pro
305 310 315 320

Gly Leu Ile Ser Asp Ile Arg Glu Met Gln Glu Ala Arg Thr Leu Ala
325 330 335

Arg Ala Arg Ala His Ala Arg Asp Gln Ala Arg Glu Gln Asp Arg Ala
340 345 350

His Ala Cys Pro Ala Val Glu Glu Thr Pro Met Asn Val Arg Asn Val
355 360 365

Pro Leu Pro Gly Asp Ala Ala Ala Gly His Pro Asp Arg Ala Ser Gly
370 375 380

His Pro Lys Pro His Ser Arg Ser Ser Ser Ala Tyr Arg Lys Ser Ala
385 390 395 400

Ser Thr His His Lys Ser Val Phe Ser His Ser Lys Ala Ala Ser Gly
405 410 415

His Leu Lys Pro Val Ser Gly His Ser Lys Pro Ala Ser Gly His Pro
420 425 430

Lys Ser Ala Thr Val Tyr Pro Lys Pro Ala Ser Val His Phe Lys Gly
435 440 445

Asp Ser Val His Phe Lys Gly Asp Ser Val His Phe Lys Pro Asp Ser
450 455 460

Val His Phe Lys Pro Ala Ser Ser Asn Pro Lys Pro Ile Thr Gly His
465 470 475 480

His Val Ser Ala Gly Ser His Ser Lys Ser Ala Phe Ser Ala Ala Thr
485 490 495

Ser His Pro Lys Pro Ile Lys Pro Ala Thr Ser His Ala Glu Pro Thr
500 505 510

Thr Ala Asp Tyr Pro Lys Pro Ala Thr Thr Ser His Pro Lys Pro Ala
515 520 525

Ala Ala Asp Asn Pro Glu Leu Ser Ala Ser His Cys Pro Glu Ile Pro
530 535 540

Ala Ile Ala His Pro Val Ser Asp Asp Ser Asp Leu Pro Glu Ser Ala
545 550 555 560

Ser Ser Pro Ala Ala Gly Pro Thr Lys Pro Ala Ala Ser Gln Leu Glu
565 570 575

Ser Asp Thr Ile Ala Asp Leu Pro Asp Pro Thr Val Val Thr Thr Ser
580 585 590

Thr Asn Asp Tyr His Asp Val Val Val Val Asp Val Glu Asp Asp Pro
595 600 605

Asp Glu Met Ala Val
610


(142) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1842 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

[illegible]CCCA CCCTAGCGGT TCCCA[illegible] GCTACCCCAG 60

CCAGAATACC CACCGGCTCT AATCATCTTT ATGTTCTGCG CGATGGTTAT CACCATCGTT 120

GTAGACCTAA TCGGCAACTC CATGGTCATT TTGGCTGTGA CGAAGAACAA GAAGCTCCGG 180

AATTCTGGCA ACATCTTCGT GGTCAGTCTC TCTGTGGCCG ATATGCTGGT GGCCATCTAC 240

CCATACCCTT TGATGCTGCA TGCCATGTCC ATTGGGGGCT GGGATCTGAG CCAGTTACAG 300

TGCCAGATGG TCGGGTTCAT CACAGGGCTG AGTGTGGTCG GCTCCATCTT CAACATCGTG 360

GCAATCGCTA TCAACCGTTA CTGCTACATC TGCCACAGCC TCCAGTACGA ACGGATCTTC 420

AGTGTGCGCA ATACCTGCAT CTACCTGGTC ATCACCTGGA TCATGACCGT CCTGGCTGTC 480

CTGCCCAACA TGTACATTGG CACCATCGAG TACGATCCTC GCACCTACAC CTGCATCTTC 540

AACTATCTGA ACAACCCTGT CTTCACTGTT ACCATCGTCT GCATCCACTT CGTCCTCCCT 600

CTCCTCATCG TGGGTTTCTG CTACGTGAGG ATCTGGACCA AAGTGCTGGC GGCCCGTGAC 660

CCTGCAGGGC AGAATCCTGA CAACCAACTT GCTGAGGTTC GCAATAAACT AACCATGTTT 720

GTGATCTTCC TCCTCTTTGC AGTGTGCTGG TGCCCTATCA ACGTGCTCAC TGTCTTGGTG 780

TTCATAGCCT ACTTCAACAG CTGCCTCAAC GCTGTGATGT ACGGGCTCCT CAATGAGAAT 900

TTCCGAAGAG AATACTGGAC CATCTTCCAT GCTATGCGGC ACCCTATCAT ATTCTTCTCT 960

GGCCTCATCA GTGATATTCG TGAGATGCAG GAGGCCCGTA CCCTGGCCCG CGCCCGTGCC 1020

CATGCTCGCG ACCAAGCTCG TGAACAAGAC CGTGCCCATG CCTGTCCTGC TGTGGAGGAA 1080

ACCCCGATGA ATGTCCGGAA TGTTCCATTA CCTGGTGATG CTGCAGCTGG CCACCCCGAC 1140

CGTGCCTCTG GCCACCCTAA GCCCCATTCC AGATCCTCCT CTGCCTATCG CAAATCTGCC 1200

TCTACCCACC ACAAGTCTGT CTTTAGCGAC TCCAAGGCTG CCTCTGGTCA CCTCAAGCCT 1260

GTCTCTGGCC ACTCCAAGCC TGCCTCTGGT CACCCCAAGT CTGCCACTGT CTACCCTAAG 1320

CCTGCCTCTG TCCATTTCAA GGCTGACTCT GTCCATTTCA AGGGTGACTC TGTCCATTTC 1380

AAGCCTGACT CTGTTCATTT CAAGCCTGCT TCCAGCAACC CCAAGCCCAT CACTGGCCAC 1440

CATGTCTCTG CTGGCAGCCA CTCCAAGTCT GCCTTCAATG CTGCCACCAG CCACCCTAAA 1500

CCCATCAAGC CAGCTACCAG CCATGCTGAG CCCACCACTG CTGACTATCC CAAGCCTGCC 1560

ACTACCAGCC ACCCTAAGCC CGCTGCTGCT GACAACCCTG AGCTCTCTGC CTCCCATTGC 1620

CCCGAGATCC CTGCCATTGC CCACCCTGTG TCTGACGACA GTGACCTCCG TGAGTCGGCC 1680

TCTAGCCCTG CCGCTGGGCC CACCAAGCCT GCTGCCAGCC AGCTGGAGTC TGACACCATC 1740

GCTGACCTTC CTGACCCTAC TGTAGTCACT ACCAGTAGCA ATGATTACCA TGATGTCGTG 1800

GTTGTTGATG TTGAAGATGA TCCTGATGAA ATGGCTGTGT GA 1842
(143) INFORMATION FOR SEQ ID NO: 142:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 613 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142:

Met Gly Pro Thr Leu Ala Val Pro Thr Pro Tyr Gly Cys Ile Gly Cys
1 5 10 15

Lys Leu Pro Gln Pro Glu Tyr Pro Pro Ala Leu Ile Ile Phe Met Phe
20 25 30

Cys Ala Met Val Ile Thr Ile Val Val Asp Leu Ile Gly Asn Ser Met
35 40 45
Val Ile Leu Ala Val Thr Lys Asn Lys Lys Leu Arg Asn Ser Gly Asn
50 55 60

Ile Phe Val Val Ser Leu Ser Val Ala Asp Met Leu Val Ala Ile Tyr
65 70 75 80

Pro Tyr Pro Leu Met Leu His Ala Met Ser Ile Gly Gly Trp Asp Leu
85 90 95

Ser Gln Leu Gln Cys Gln Met Val Gly Phe Ile Thr Gly Leu Ser Val
100 105 110

Val Gly Ser Ile Phe Asn Ile Val Ala Ile Ala Ile Asn Arg Tyr Cys
115 120 125

Tyr Ile Cys His Ser Leu Gln Tyr Glu Arg Ile Phe Ser Val Arg Asn
130 135 140

Thr Cys Ile Tyr Leu Val Ile Thr Trp Ile Met Thr Val Leu Ala Val
145 150 155 160

Leu Pro Asn Met Tyr Ile Gly Thr Ile Glu Tyr Asp Pro Arg Thr Tyr
165 170 175

Thr Cys Ile Phe Asn Tyr Leu Asn Asn Pro Val Phe Thr Val Thr Ile
180 185 190

Val Cys Ile His Phe Val Leu Pro Leu Leu Ile Val Gly Phe Cys Tyr
195 200 205

Val Arg Ile Trp Thr Lys Val Leu Ala Ala Arg Asp Pro Ala Gly Gln
210 215 220

Asn Pro Asp Asn Gln Leu Ala Glu Val Arg Asn Lys Leu Thr Met Phe
225 230 235 240

Val Ile Phe Leu Leu Phe Ala Val Cys Trp Cys Pro Ile Asn Val Leu
245 250 255

Thr Val Leu Val Ala Val Ser Pro Lys Glu Met Ala Gly Lys Ile Pro
260 265 270

Asn Trp Leu Tyr Leu Ala Ala Tyr Phe Ile Ala Tyr Phe Asn Ser Cys
275 280 285

Leu Asn Ala Val Ile Tyr Gly Leu Leu Asn Glu Asn Phe Arg Arg Glu
290 295 300

[illegible]
305 310 315 320

Gly Leu Ile Ser Asp Ile Arg Glu Met Gln Glu Ala Arg Thr Leu Ala
325 330 335

Arg Ala Arg Ala His Ala Arg Asp Gln Ala Arg Glu Gln Asp Arg Ala
340 345 350

His Ala Cys Pro Ala Val Glu Glu Thr Pro Met Asn Val Arg Asn Val
355 360 365

Pro Leu Pro Gly Asp Ala Ala Ala Gly His Pro Asp Arg Ala Ser Gly
370 375 380

His Pro Lys Pro His Ser Arg Ser Ser Ser Ala Tyr Arg Lys Ser Ala
385 390 395 400

Ser Thr His His Lys Ser Val Phe Ser His Ser Lys Ala Ala Ser Gly
405 410 415

His Leu Lys Pro Val Ser Gly His Ser Lys Pro Ala Ser Gly His Pro
420 425 430

Lys Ser Ala Thr Val Tyr Pro Lys Pro Ala Ser Val His Phe Lys Ala
435 440 445

Asp Ser Val His Phe Lys Gly Asp Ser Val His Phe Lys Pro Asp Ser
450 455 460

Val His Phe Lys Pro Ala Ser Ser Asn Pro Lys Pro Ile Thr Gly His
465 470 475 480

His Val Ser Ala Gly Ser His Ser Lys Ser Ala Phe Asn Ala Ala Thr
485 490 495

Ser His Pro Lys Pro Ile Lys Pro Ala Thr Ser His Ala Glu Pro Thr
500 505 510

Thr Ala Asp Tyr Pro Lys Pro Ala Thr Thr Ser His Pro Lys Pro Ala
515 520 525

Ala Ala Asp Asn Pro Glu Leu Ser Ala Ser His Cys Pro Glu Ile Pro
530 535 540

Ala Ile Ala His Pro Val Ser Asp Asp Ser Asp Leu Pro Glu Ser Ala
545 550 555 560

Ser Ser Pro Ala Ala Gly Pro Thr Lys Pro Ala Ala Ser Gln Leu Glu
565 570 575

Ser Asp Thr Ile Ala Asp Leu Pro Asp Pro Thr Val Val Thr Thr Ser
580 585 590

Thr Asn Asp Tyr His Asp Val Val Val Val Asp Val Glu Asp Asp Pro
595 600 605

Asp Glu Met Ala Val
610

(144) INFORMATION FOR SEQ ID NO: 143:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
GCTGAGGTTC GCAATAAACT AACCATGTTT GTG 33
(145) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:
CTCCTTCGGT CCTCCTATCG TTGTCAGAAG T 31
(146) INFORMATION FOR SEQ ID NO: 145:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145:
TTAGATATCG GGGCCCACCC TAGCGGT 27
(147) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:
GGTACCCCCA CAGCCATTTC ATCAGGATC 29
FURTHER INFORMATION CONTINUED FROM PCT/ISA/210



1. Claims: 1-4 

A cDNA encoding a non-endogenous, constitutively activated 
version of a human G-protein-coupled receptor comprising 
hARE-3(F313K); the receptor encoded by said cDNA; a plasmid 
comprising said cDNA; and a host cell comprising said 
plasmid. 

2. Claims: 5-8 

A cDNA encoding a non-endogenous, constitutively activated 
version of a human G-protein-coupled receptor comprising 
hARE-4(V233K); the receptor encoded by said cDNA; a plasmid 
comprising said cDNA; and a host cell comprising said 
plasmid. 



3. Claims: 9-12 

A cDNA encoding a non-endogenous, constitutively activated 
version of a human G-protein-coupled receptor comprising 
hARE-5(A240K); the receptor encoded by said cDNA; a plasmid 
comprising said cDNA; and a host cell comprising said 
plasmid. 



4. Claims: 13-16 

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising 
hGPCR14(L257K); the receptor encoded by said cDNA; a plasmid 
comprising said cDNA; and a host cell comprising said 
plasmid. 



5. Claims: 17-20 

A cDNA encoding a non-endogenous, constitutively activated 
version of a human G-protein-coupled receptor comprising 
hGPCR27(C283K); the receptor encoded by said cDNA; a plasmid 
comprising said cDNA; and a host cell comprising said 
plasmid. 



6. Claims: 21-24 

A cDNA encoding a non-endogenous, constitutively activated 
version of a human G-protein-coupled receptor comprising 
hARE-1(E232K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said 
plasmid. 



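7. Claims: 25-28

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hARE-2(G285K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.



8. Claims: 29-32

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hPPR1(L239K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.



9. Claims: 33-36

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hG2A(K232A); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.



10. Claims: 37-40

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hRUP3(L224K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.



11. Claims: 41-44

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hRUP5(A236K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.



12. Claims: 45-48

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hRUP6(N267K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.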



13. Claims: 49-52 
A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hRUP7(A302K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.

14. Claims: 53-56

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hCHN4(V236K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.

15. Claims: 57-60

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hMC4(A244K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.



16. Claims: 61-64

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hCHN3(S284K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.



17. Claims: 65-68

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hCHN6(L352K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.

18. Claims: 69-72

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hCHN8(N235K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.



19. Claims: 73-76

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled receptor comprising
hH9(F236K); the receptor encoded by said cDNA; a plasmid
comprising said cDNA; and a host cell comprising said
plasmid.



20. Claims: 77-80

A cDNA encoding a non-endogenous, constitutively activated
version of a human G-protein-coupled AT1 receptor selected
from the group consisting of hAT1(F239K), hAT1(N111A),
hAT1(AT2K255IC3) and hAT1(A243+); the receptor encoded by
said cDNA; a plasmid comprising said cDNA; and a host cell
comprising said plasmid.



INTERNATIONAL SEARCH REPORT

Information on patent family members

International Application No
PCT/US 99/24065

Patent document          Publication   Patent family     Publication
cited in search report   date          member(s)         date

WO 9721731 A             19-06-1997    US 5750353 A      12-05-1998
                                       AU 715611 B       03-02-2000
                                       AU 1334397 A      03-07-1997
                                       CA 2239293 A      19-06-1997
                                       EP 0869975 A      14-10-1998

WO 9838217 A             03-09-1998    AU 6343998 A      18-09-1998

WO 9924569 A             20-05-1999    NONE



